From: Boris Brezillon <boris.brezillon@collabora.com>
To: Liviu Dudau <liviu.dudau@arm.com>
Cc: dri-devel@lists.freedesktop.org,
Steven Price <steven.price@arm.com>,
Dmitry Osipenko <dmitry.osipenko@collabora.com>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Akash Goel <akash.goel@arm.com>, Chia-I Wu <olvaffe@gmail.com>,
Rob Clark <robin.clark@oss.qualcomm.com>,
Dmitry Baryshkov <lumag@kernel.org>,
Abhinav Kumar <abhinav.kumar@linux.dev>,
Jessica Zhang <jesszhan0024@gmail.com>,
Sean Paul <sean@poorly.run>,
Marijn Suijten <marijn.suijten@somainline.org>,
linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] drm/panthor: Don't use the racy drm_gem_lru_remove() helper
Date: Thu, 7 May 2026 17:03:43 +0200 [thread overview]
Message-ID: <20260507170343.044934a0@fedora> (raw)
In-Reply-To: <afykcyiURGZh0xdr@e142607>
On Thu, 7 May 2026 15:40:51 +0100
Liviu Dudau <liviu.dudau@arm.com> wrote:
> On Thu, May 07, 2026 at 02:10:27PM +0200, Boris Brezillon wrote:
> > On Thu, 7 May 2026 11:01:25 +0100
> > Liviu Dudau <liviu.dudau@arm.com> wrote:
> >
> > > On Wed, May 06, 2026 at 02:16:26PM +0200, Boris Brezillon wrote:
> > > > drm_gem_lru_remove() dereference stores drm_gem_object::lru in a local
> > > > variable that's then dereferenced to acquire the LRU lock. Because this
> > > > assignment in done without the LRU lock held, it can race with
> > > > drm_gem_lru_scan() where drm_gem_object::lru is temporarily assigned
> > > > a stack-allcated LRU that goes away when leaving the function. By
> > > > the time we dereference this local lru variable, the object might already
> > > > be gone.
> > > >
> > > > It feels like drm_gem_lru_move_tail() was never meant to be used this
> > > > way, because there's no easy way we can avoid this race unless we defer
> > > > the locking to the caller. Let's add an explicit LRU for unreclaimable
> > > > BOs instead, and have all BOs added to this LRU at creation time.
> > >
> > > I would argue that drm_gem_lru_scan() is broken by design. If you're going
> > > to release the LRU lock in the middle of a loop you can expect that someone
> > > will get hold of your stack-allocated LRU and end up picking the pieces.
> >
> > I think it's fine as long as you always use the drm_gem_lru helpers to
> > manipulate the lru field, which is true of a lot of kernel constructs.
>
> I think drm_gem_lru_scan() should never set an object's lru field to still_in_lru.
> It should set it to NULL when the object's node is removed from its lru and add
> it into still_in_lru without making the drm_gem_object->lru to point back to it.
> At the very end when we splice back the still_in_lru list back into lru's list we
> can then update obj->lru.
Then you run into another race between drm_gem_lru_scan() and
drm_gem_object_release(), where the LRU removal in _release() is
skipped because obj->lru is NULL, and all of a sudden, the still_in_lru
list has an element that's freed. Honestly, I don't think obj->lru
pointing to a stack allocated object is a problem as long as we don't
let gem users play freely with obj->lru (which we shouldn't do anyway).
>
> >
> > > This patch is fine in itself by trying to avoid stepping into the fight,
> > > but I think we should also add a warning in drm_gem_lru_scan() for future
> > > users to be aware of the dangers.
> >
> > Warning the user about what? There's nothing they can do about it, and
> > I don't even think it's unsafe per-se, unless someone goes off and
> > stores the drm_gem_object::lru value somewhere else while their shrink()
> > callback is called, and accesses it later, outside the shrinker path.
> > Given drm_gem_lru is not refcounted, there's no way one could safely
> > hold on the LRU they saw in the shrink() callback anyway, so I don't
> > think that's fair to blame the drm_gem_lru API for this kind of misuse.
>
> Yeah, that would be the warning: don't store the object's lru as you might
> get a temporary one that will become invalid after the shrinker has run.
Oh, you mean a comment explaining this should be avoided, not an actual
drm_warn(). Then yes, I think it's fine to document the expectations in
the drm_gem_object::lru doc.
next prev parent reply other threads:[~2026-05-07 15:03 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-06 12:16 [PATCH 0/3] drm/panthor: Fix a race in the shrinker logic Boris Brezillon
2026-05-06 12:16 ` [PATCH 1/3] drm/panthor: Don't use the racy drm_gem_lru_remove() helper Boris Brezillon
2026-05-06 15:40 ` Steven Price
2026-05-06 16:25 ` Boris Brezillon
2026-05-07 10:01 ` Liviu Dudau
2026-05-07 12:10 ` Boris Brezillon
2026-05-07 14:40 ` Liviu Dudau
2026-05-07 15:03 ` Boris Brezillon [this message]
2026-05-07 15:18 ` Rob Clark
2026-05-06 12:16 ` [PATCH 2/3] drm/gem: Fix a race between drm_gem_lru_scan() and drm_gem_object_release() Boris Brezillon
2026-05-06 13:21 ` Rob Clark
2026-05-06 14:33 ` Boris Brezillon
2026-05-07 10:18 ` Liviu Dudau
2026-05-07 12:46 ` Boris Brezillon
2026-05-07 21:38 ` Rob Clark
2026-05-08 8:41 ` Boris Brezillon
2026-05-08 13:49 ` Rob Clark
2026-05-06 12:16 ` [PATCH 3/3] drm/gem: Stop exposing the racy/unsafe drm_gem_lru_remove() helper Boris Brezillon
2026-05-06 15:40 ` Steven Price
2026-05-07 10:20 ` Liviu Dudau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260507170343.044934a0@fedora \
--to=boris.brezillon@collabora.com \
--cc=abhinav.kumar@linux.dev \
--cc=airlied@gmail.com \
--cc=akash.goel@arm.com \
--cc=dmitry.osipenko@collabora.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=freedreno@lists.freedesktop.org \
--cc=jesszhan0024@gmail.com \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=liviu.dudau@arm.com \
--cc=lumag@kernel.org \
--cc=maarten.lankhorst@linux.intel.com \
--cc=marijn.suijten@somainline.org \
--cc=mripard@kernel.org \
--cc=olvaffe@gmail.com \
--cc=robin.clark@oss.qualcomm.com \
--cc=sean@poorly.run \
--cc=simona@ffwll.ch \
--cc=steven.price@arm.com \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.