From: Matthew Brost <matthew.brost@intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: phasta@kernel.org, dri-devel@lists.freedesktop.org,
kernel-dev@igalia.com, "Dan Carpenter" <dan.carpenter@linaro.org>,
"Christian König" <christian.koenig@amd.com>,
"Rob Clark" <robdclark@chromium.org>,
"Daniel Vetter" <daniel.vetter@ffwll.ch>,
"Danilo Krummrich" <dakr@kernel.org>,
"Christian König" <ckoenig.leichtzumerken@gmail.com>,
stable@vger.kernel.org
Subject: Re: [PATCH] drm/sched: Fix potential double free in drm_sched_job_add_resv_dependencies
Date: Mon, 13 Oct 2025 10:05:56 -0700 [thread overview]
Message-ID: <aO0xdLWYdPOKE9r2@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <fa1cf2f6-a82d-46f4-b1ec-b07e678cad76@igalia.com>
On Mon, Oct 13, 2025 at 06:01:49PM +0100, Tvrtko Ursulin wrote:
>
> On 13/10/2025 15:39, Philipp Stanner wrote:
> > On Fri, 2025-10-03 at 10:26 +0100, Tvrtko Ursulin wrote:
> > > Drm_sched_job_add_dependency() consumes the fence reference both on
> >
> > s/D/d
>
> I cannot find the source right now but I am 99% sure someone pulled it on me
> in the past and educated me sentences should always start with a capital
> letter, even when it is not a proper word. Because in the past I was always
> doing the same as you suggest. I don't mind TBH.
Drm_sched_job_add_dependency breaks grep / searches, so I'm with Philipp s/D/d
Matt
> > > success and failure, so in the latter case the dma_fence_put() on the
> > > error path (xarray failed to expand) is a double free.
> > >
> > > Interestingly this bug appears to have been present ever since
> > > ebd5f74255b9 ("drm/sched: Add dependency tracking"), since the code back
> > > then looked like this:
> > >
> > > drm_sched_job_add_implicit_dependencies():
> > > ...
> > > for (i = 0; i < fence_count; i++) {
> > > ret = drm_sched_job_add_dependency(job, fences[i]);
> > > if (ret)
> > > break;
> > > }
> > >
> > > for (; i < fence_count; i++)
> > > dma_fence_put(fences[i]);
> > >
> > > Which means for the failing 'i' the dma_fence_put was already a double
> > > free. Possibly there were no users at that time, or the test cases were
> > > insufficient to hit it.
> > >
> > > The bug was then only noticed and fixed after
> > > 9c2ba265352a ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
> > > landed, with its fixup of
> > > 4eaf02d6076c ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies").
> > >
> > > At that point it was a slightly different flavour of a double free, which
> > > 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
> > > noticed and attempted to fix.
> > >
> > > But it only moved the double free from happening inside the
> > > drm_sched_job_add_dependency(), when releasing the reference not yet
> > > obtained, to the caller, when releasing the reference already released by
> > > the former in the failure case.
> >
> > That's certainly interesting, but is there a specific reason why you
> > include all of that?
>
> Yes, because bug was attempted to be fixed two times already. So it is IMO
> worth having a write up on the history.
> > The code is as is, and AFAICS it's just a bug stemming from original
> > bugs present and then refactorings happening.
> >
> > I would at least remove the old 'implicit_dependencies' function from
> > the commit message. It's just confusing and makes one look for that in
> > the current code or patch.It does say "back then" and it references the
> > commit so shouldn't be
> confusing.
>
> The discussion is long for an additional reason that Fixes: targets not the
> original commit which did not get this right, but the last change which
> tried to fix it but did not. It sounded more logical to continue the chain
> on fixes but dunno. Could replace it with the original one just as well.
> > > As such it is not easy to identify the right target for the fixes tag so
> > > lets keep it simple and just continue the chain.
> > >
> > > We also drop the misleading comment about additional reference, since it
> > > is not additional but the only one from the point of view of dependency
> > > tracking.
> >
> >
> > IMO that comment is nonsense. It's useless, too, because I can *see*
> > that a reference is being taken there, but not *why*.
> >
> > Argh, these comments. See also my commit 72ebc18b34993
> >
> >
> > Anyways. Removing it is fine, but adding a better comment is better.
> > See below.
> >
> > >
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> > > Fixes: 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
> > > Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> >
> > Is there an error report that could be included here with a Closes:
> > tag?No, it did not come from any such system. Could perhaps put a link
> > to
> the report as Reference:
> https://lore.kernel.org/dri-devel/aNFbXq8OeYl3QSdm@stanley.mountain/
>
> ?
>
> > > Cc: Christian König <christian.koenig@amd.com>
> > > Cc: Rob Clark <robdclark@chromium.org>
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > Cc: Danilo Krummrich <dakr@kernel.org>
> > > Cc: Philipp Stanner <phasta@kernel.org>
> > > Cc: "Christian König" <ckoenig.leichtzumerken@gmail.com>
> > > Cc: dri-devel@lists.freedesktop.org
> > > Cc: <stable@vger.kernel.org> # v5.16+
> > > ---
> > > drivers/gpu/drm/scheduler/sched_main.c | 14 +++++---------
> > > 1 file changed, 5 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 46119aacb809..aff34240f230 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -960,20 +960,16 @@ int drm_sched_job_add_resv_dependencies(struct drm_sched_job *job,
> > > {
> > > struct dma_resv_iter cursor;
> > > struct dma_fence *fence;
> > > - int ret;
> > > + int ret = 0;
> > > dma_resv_assert_held(resv);
> > > dma_resv_for_each_fence(&cursor, resv, usage, fence) {
> > > - /* Make sure to grab an additional ref on the added fence */
> > > - dma_fence_get(fence);
> > > - ret = drm_sched_job_add_dependency(job, fence);
> > > - if (ret) {
> > > - dma_fence_put(fence);
> > > - return ret;
> > > - }
> > > + ret = drm_sched_job_add_dependency(job, dma_fence_get(fence));
> >
> > You still take a reference as before, but there is no comment anymore.
> > Can you add one explaining why a new reference is taken here?
> >
> > I guess it will be something like "This needs a new reference for the
> > job", since you cannot rely on the one from resv.
>
> I was mulling it over but then thought the kerneldoc for
> drm_sched_job_add_dependency() is good enough since it explains the function
> always consumes the reference and dma_resv_for_each_fence() does not hold
> one over the iteration. I can add it.
>
> >
> > > + if (ret)
> > > + break;
> > > }
> > > - return 0;
> > > + return ret;
> >
> >
> > That's an unnecessarily enlargement of the git diff because of style,
> > isn't it? Better keep the diff minimal here for git blame.
>
> Given the relative size of the diff versus the function itself it did not
> look like a big deal to reduce to one return statement while touching it,
> but I can keep two returns just as well, no problem.
>
> Regards,
>
> Tvrtko
>
prev parent reply other threads:[~2025-10-13 17:06 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-03 9:26 [PATCH] drm/sched: Fix potential double free in drm_sched_job_add_resv_dependencies Tvrtko Ursulin
2025-10-13 12:21 ` Tvrtko Ursulin
2025-10-13 14:39 ` Philipp Stanner
2025-10-13 17:01 ` Tvrtko Ursulin
2025-10-13 17:05 ` Matthew Brost [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aO0xdLWYdPOKE9r2@lstrano-desk.jf.intel.com \
--to=matthew.brost@intel.com \
--cc=christian.koenig@amd.com \
--cc=ckoenig.leichtzumerken@gmail.com \
--cc=dakr@kernel.org \
--cc=dan.carpenter@linaro.org \
--cc=daniel.vetter@ffwll.ch \
--cc=dri-devel@lists.freedesktop.org \
--cc=kernel-dev@igalia.com \
--cc=phasta@kernel.org \
--cc=robdclark@chromium.org \
--cc=stable@vger.kernel.org \
--cc=tvrtko.ursulin@igalia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox