From: Liviu Dudau <liviu.dudau@arm.com>
To: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Steven Price <steven.price@arm.com>,
Dmitry Osipenko <dmitry.osipenko@collabora.com>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Akash Goel <akash.goel@arm.com>, Chia-I Wu <olvaffe@gmail.com>,
Rob Clark <robin.clark@oss.qualcomm.com>,
Dmitry Baryshkov <lumag@kernel.org>,
Abhinav Kumar <abhinav.kumar@linux.dev>,
Jessica Zhang <jesszhan0024@gmail.com>,
Sean Paul <sean@poorly.run>,
Marijn Suijten <marijn.suijten@somainline.org>,
linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org,
dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] drm/panthor: Don't use the racy drm_gem_lru_remove() helper
Date: Thu, 7 May 2026 11:01:25 +0100 [thread overview]
Message-ID: <afxi9ab1O9J_7J1Y@e142607> (raw)
In-Reply-To: <20260506-panthor-shrinker-fixes-v1-1-e7721526de96@collabora.com>
On Wed, May 06, 2026 at 02:16:26PM +0200, Boris Brezillon wrote:
> drm_gem_lru_remove() dereference stores drm_gem_object::lru in a local
> variable that's then dereferenced to acquire the LRU lock. Because this
> assignment in done without the LRU lock held, it can race with
> drm_gem_lru_scan() where drm_gem_object::lru is temporarily assigned
> a stack-allcated LRU that goes away when leaving the function. By
> the time we dereference this local lru variable, the object might already
> be gone.
>
> It feels like drm_gem_lru_move_tail() was never meant to be used this
> way, because there's no easy way we can avoid this race unless we defer
> the locking to the caller. Let's add an explicit LRU for unreclaimable
> BOs instead, and have all BOs added to this LRU at creation time.
I would argue that drm_gem_lru_scan() is broken by design. If you're going
to release the LRU lock in the middle of a loop you can expect that someone
will get hold of your stack-allocated LRU and end up picking the pieces.
This patch is fine in itself by trying to avoid stepping into the fight,
but I think we should also add a warning in drm_gem_lru_scan() for future
users to be aware of the dangers.
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Best regards,
Liviu
>
> Fixes: fb42964e2a76 ("drm/panthor: Add a GEM shrinker")
> Reported-by: Chia-I Wu <olvaffe@gmail.com>
> Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/86
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.h | 10 ++++++++++
> drivers/gpu/drm/panthor/panthor_gem.c | 5 ++++-
> 2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 4e4607bca7cc..45b71546f83c 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -190,6 +190,16 @@ struct panthor_device {
> /** @reclaim.lock: Lock protecting all LRUs */
> struct mutex lock;
>
> + /**
> + * @reclaim.unreclaimable: unreclaimable BOs
> + *
> + * Either the BO is unreclaimable because it has no pages allocated,
> + * or it's unreclaimable because pages are pinned.
> + *
> + * All BOs start in that list at creation time.
> + */
> + struct drm_gem_lru unreclaimable;
> +
> /**
> * @reclaim.unused: BOs with unused pages
> *
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
> index 13295d7a593d..8e31740126e7 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.c
> +++ b/drivers/gpu/drm/panthor/panthor_gem.c
> @@ -204,7 +204,7 @@ void panthor_gem_update_reclaim_state_locked(struct panthor_gem_object *bo,
> drm_gem_lru_move_tail(&ptdev->reclaim.gpu_mapped_shared, &bo->base);
> break;
> case PANTHOR_GEM_UNRECLAIMABLE:
> - drm_gem_lru_remove(&bo->base);
> + drm_gem_lru_move_tail(&ptdev->reclaim.unreclaimable, &bo->base);
> break;
> default:
> drm_WARN(&ptdev->base, true, "invalid GEM reclaim state (%d)\n", new_state);
> @@ -994,6 +994,7 @@ static struct panthor_gem_object *
> panthor_gem_create(struct drm_device *dev, size_t size, uint32_t flags,
> struct panthor_vm *exclusive_vm, u32 usage_flags)
> {
> + struct panthor_device *ptdev = container_of(dev, struct panthor_device, base);
> struct panthor_gem_object *bo;
> int ret;
>
> @@ -1026,6 +1027,7 @@ panthor_gem_create(struct drm_device *dev, size_t size, uint32_t flags,
> }
>
> panthor_gem_debugfs_set_usage_flags(bo, usage_flags);
> + drm_gem_lru_move_tail(&ptdev->reclaim.unreclaimable, &bo->base);
> return bo;
>
> err_put:
> @@ -1551,6 +1553,7 @@ int panthor_gem_shrinker_init(struct panthor_device *ptdev)
> return ret;
>
> INIT_LIST_HEAD(&ptdev->reclaim.vms);
> + drm_gem_lru_init(&ptdev->reclaim.unreclaimable, &ptdev->reclaim.lock);
> drm_gem_lru_init(&ptdev->reclaim.unused, &ptdev->reclaim.lock);
> drm_gem_lru_init(&ptdev->reclaim.mmapped, &ptdev->reclaim.lock);
> drm_gem_lru_init(&ptdev->reclaim.gpu_mapped_shared, &ptdev->reclaim.lock);
>
> --
> 2.54.0
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
next prev parent reply other threads:[~2026-05-07 10:01 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-06 12:16 [PATCH 0/3] drm/panthor: Fix a race in the shrinker logic Boris Brezillon
2026-05-06 12:16 ` [PATCH 1/3] drm/panthor: Don't use the racy drm_gem_lru_remove() helper Boris Brezillon
2026-05-06 15:40 ` Steven Price
2026-05-06 16:25 ` Boris Brezillon
2026-05-07 10:01 ` Liviu Dudau [this message]
2026-05-07 12:10 ` Boris Brezillon
2026-05-07 14:40 ` Liviu Dudau
2026-05-07 15:03 ` Boris Brezillon
2026-05-07 15:18 ` Rob Clark
2026-05-06 12:16 ` [PATCH 2/3] drm/gem: Fix a race between drm_gem_lru_scan() and drm_gem_object_release() Boris Brezillon
2026-05-06 13:21 ` Rob Clark
2026-05-06 14:33 ` Boris Brezillon
2026-05-07 10:18 ` Liviu Dudau
2026-05-07 12:46 ` Boris Brezillon
2026-05-07 21:38 ` Rob Clark
2026-05-08 8:41 ` Boris Brezillon
2026-05-08 13:49 ` Rob Clark
2026-05-06 12:16 ` [PATCH 3/3] drm/gem: Stop exposing the racy/unsafe drm_gem_lru_remove() helper Boris Brezillon
2026-05-06 15:40 ` Steven Price
2026-05-07 10:20 ` Liviu Dudau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=afxi9ab1O9J_7J1Y@e142607 \
--to=liviu.dudau@arm.com \
--cc=abhinav.kumar@linux.dev \
--cc=airlied@gmail.com \
--cc=akash.goel@arm.com \
--cc=boris.brezillon@collabora.com \
--cc=dmitry.osipenko@collabora.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=freedreno@lists.freedesktop.org \
--cc=jesszhan0024@gmail.com \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lumag@kernel.org \
--cc=maarten.lankhorst@linux.intel.com \
--cc=marijn.suijten@somainline.org \
--cc=mripard@kernel.org \
--cc=olvaffe@gmail.com \
--cc=robin.clark@oss.qualcomm.com \
--cc=sean@poorly.run \
--cc=simona@ffwll.ch \
--cc=steven.price@arm.com \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.