From: Danilo Krummrich <dakr@kernel.org>
To: phasta@kernel.org
Cc: "Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>,
"Lyude Paul" <lyude@redhat.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Matthew Brost" <matthew.brost@intel.com>,
"Christian König" <ckoenig.leichtzumerken@gmail.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/5] drm/sched: Warn if pending list is not empty
Date: Tue, 22 Apr 2025 16:52:54 +0200
Message-ID: <aAetRm3Sbp9vzamg@cassiopeiae>
In-Reply-To: <f0ae2d411c21e799491244fe49880a4acca32918.camel@mailbox.org>
On Tue, Apr 22, 2025 at 04:16:48PM +0200, Philipp Stanner wrote:
> On Tue, 2025-04-22 at 16:08 +0200, Danilo Krummrich wrote:
> > On Tue, Apr 22, 2025 at 02:39:21PM +0100, Tvrtko Ursulin wrote:
>
> > > > Sorry I don't see the argument for the claim it is relying on the
> > > > internals with the re-positioned drm_sched_fini call. In that case it is
> > > > fully compliant with:
> > > >
> > > > /**
> > > >  * drm_sched_fini - Destroy a gpu scheduler
> > > >  *
> > > >  * @sched: scheduler instance
> > > >  *
> > > >  * Tears down and cleans up the scheduler.
> > > >  *
> > > >  * This stops submission of new jobs to the hardware through
> > > >  * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
> > > >  * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
> > > >  * There is no solution for this currently. Thus, it is up to the driver to make
> > > >  * sure that:
> > > >  *
> > > >  *  a) drm_sched_fini() is only called after for all submitted jobs
> > > >  *     drm_sched_backend_ops.free_job() has been called or that
> > > >  *  b) the jobs for which drm_sched_backend_ops.free_job() has not been called
> > > >  *
> > > >  * FIXME: Take care of the above problem and prevent this function from leaking
> > > >  * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
> > > >
> > > > ^^^ recommended solution b).
> >
> > > This has been introduced recently with commit baf4afc58314 ("drm/sched: Improve
> > > teardown documentation") and I do not agree with this. The scheduler should
> > > *not* make any promises about implementation details to enable drivers to abuse
> > > their knowledge about component internals.
> > >
> > > This makes the problem *worse* as it encourages drivers to rely on
> > > implementation details, making maintainability of the scheduler even worse.
> > >
> > > For instance, what if I change the scheduler implementation, such that for every
> > > entry in the pending_list the scheduler allocates another internal object for
> > > ${something}? Then drivers would already fall apart leaking those internal
> > > objects.
> > >
> > > Now, obviously that's pretty unlikely, but I assume you get the idea.
> > >
> > > The b) paragraph in drm_sched_fini() should be removed for the given reasons.
> > >
> > > AFAICS, since the introduction of this commit, driver implementations haven't
> > > changed in this regard, hence we should be good.
> > >
> > > So, for me this doesn't change the fact that every driver implementation that
> > > just stops the scheduler at an arbitrary point of time and tries to clean things
> > > up manually relying on knowledge about component internals is broken.
>
> To elaborate on that, this documentation has been written so that we at
> least have *some* documentation about the problem, instead of just
> letting new drivers run into the knife.
>
> The commit explicitly introduced the FIXME, marking those two hacky
> workarounds as undesirable.
>
> But back then we couldn't fix the problem quickly, so it was either
> document the issue at least a bit, or leave it completely undocumented.

Agreed, but b) really sounds like an invitation (or even justification) for
doing the wrong thing, let's remove it.
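
To make the concern about internal objects quoted above a bit more concrete,
here is a toy userspace model in C. It is only a sketch and not kernel code:
every toy_* name is made up for illustration, the singly linked list merely
stands in for pending_list, and the fixed-size pool is an assumption that keeps
the example self-contained. It models a scheduler that attaches private
bookkeeping to each pending job, and a driver that cleans up the list by hand
without knowing about it.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_JOBS 8

/* Hypothetical scheduler-private object; drivers never see this type. */
struct toy_internal {
	int in_use;
};

struct toy_job {
	struct toy_job *next;
	struct toy_internal *priv;	/* attached by the scheduler on submit */
};

struct toy_sched {
	struct toy_job *pending;		/* stand-in for pending_list */
	struct toy_internal pool[TOY_MAX_JOBS];	/* scheduler-owned bookkeeping */
};

/* Number of internal objects the scheduler still considers live. */
static int toy_live_internals(const struct toy_sched *s)
{
	int i, n = 0;

	for (i = 0; i < TOY_MAX_JOBS; i++)
		n += s->pool[i].in_use;
	return n;
}

/* Submit a job: queue it and attach one private object from the pool. */
static struct toy_job *toy_submit(struct toy_sched *s)
{
	struct toy_job *job;
	int i;

	/* Find a free pool slot; fail if the pool is exhausted. */
	for (i = 0; i < TOY_MAX_JOBS && s->pool[i].in_use; i++)
		;
	if (i == TOY_MAX_JOBS)
		return NULL;

	job = calloc(1, sizeof(*job));
	if (!job)
		return NULL;

	s->pool[i].in_use = 1;
	job->priv = &s->pool[i];
	job->next = s->pending;
	s->pending = job;
	return job;
}

/* Scheduler-driven teardown: releases each job and its private object. */
static void toy_sched_teardown(struct toy_sched *s)
{
	while (s->pending) {
		struct toy_job *job = s->pending;

		s->pending = job->next;
		job->priv->in_use = 0;
		free(job);
	}
}

/*
 * Driver-side manual cleanup written against the list layout only: it
 * frees the jobs but knows nothing about priv, so the private objects
 * stay marked as in use (the modelled leak).
 */
static void toy_driver_manual_cleanup(struct toy_sched *s)
{
	while (s->pending) {
		struct toy_job *job = s->pending;

		s->pending = job->next;
		free(job);
	}
}
```

In this model, toy_live_internals() returns 0 after toy_sched_teardown(), but
stays at the number of submitted jobs after toy_driver_manual_cleanup(), even
though the pending list looks empty in both cases.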