From: Boris Brezillon <boris.brezillon@collabora.com>
To: Steven Price <steven.price@arm.com>,
	Liviu Dudau <liviu.dudau@arm.com>,
	Boris Brezillon <boris.brezillon@collabora.com>,
	Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Maxime Ripard <mripard@kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
	Akash Goel <akash.goel@arm.com>, Chia-I Wu <olvaffe@gmail.com>,
	Rob Clark <robin.clark@oss.qualcomm.com>,
	Dmitry Baryshkov <lumag@kernel.org>,
	Abhinav Kumar <abhinav.kumar@linux.dev>,
	Jessica Zhang <jesszhan0024@gmail.com>,
	Sean Paul <sean@poorly.run>,
	Marijn Suijten <marijn.suijten@somainline.org>,
	linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] drm/gem: Fix a race between drm_gem_lru_scan() and drm_gem_object_release()
Date: Thu, 7 May 2026 14:46:39 +0200
Message-ID: <20260507144639.68bd699f@fedora>
In-Reply-To: <20260506-panthor-shrinker-fixes-v1-2-e7721526de96@collabora.com>

On Wed, 06 May 2026 14:16:27 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> The following race can currently happen:
> 
> | Thread 0 in `drm_gem_lru_scan`               | Thread 1 in `drm_gem_object_release` |
> | -                                            | -                                    |
> | move obj1 with refcount==0 to `still_in_lru` |                                      |
> | move obj2 with refcount!=0 to `still_in_lru` |                                      |
> | mutex_unlock                                 |                                      |
> | shrink obj2                                  |                                      |
> |                                              | lru = obj1->lru; // `still_in_lru`   |
> | mutex_lock                                   |                                      |
> | move obj1 back to the original lru           |                                      |
> | mutex_unlock                                 |                                      |
> | return                                       |                                      |
> |                                              | dereference `still_in_lru`           |
> 
> Move the drm_gem_lru_move_tail_locked() after the
> kref_get_unless_zero() check so that we don't end up with a
> vanishing LRU when we hit drm_gem_object_release(). We also need to
> remove the skipped object from its LRU, otherwise we'll keep hitting
> it on subsequent loop iterations until it's actually removed from the
> list in drm_gem_object_release().
> 
> Fixes: e7c2af13f811 ("drm/gem: Add LRU/shrinker helper")
> Reported-by: Chia-I Wu <olvaffe@gmail.com>
> Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/86
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
> ---
>  drivers/gpu/drm/drm_gem.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index fca42949eb2b..97cf63de0112 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1660,15 +1660,19 @@ drm_gem_lru_scan(struct drm_gem_lru *lru,
>  		if (!obj)
>  			break;
>  
> -		drm_gem_lru_move_tail_locked(&still_in_lru, obj);
> -
>  		/*
>  		 * If it's in the process of being freed, gem_object->free()
> -		 * may be blocked on lock waiting to remove it.  So just
> -		 * skip it.
> +		 * may be blocked on lock waiting to remove it.  So just remove
> +		 * it from its current LRU and skip it.
>  		 */
> -		if (!kref_get_unless_zero(&obj->refcount))
> +		if (!kref_get_unless_zero(&obj->refcount)) {
> +			if (obj->lru)
> +				drm_gem_lru_remove_locked(obj);
> +
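
For anyone following along, the reordering in the hunk above can be modelled in userspace roughly as below. This is only a sketch under assumptions: `model_obj`, `model_lru` and `model_scan()` are hypothetical stand-ins rather than the kernel structures, locking is left out entirely, and a plain integer replaces the kref. It only shows that a dying object is dropped from its list and never ends up pointing at the scan-local `still_in_lru`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's structures. */
struct model_lru;

struct model_obj {
	int refcount;           /* stands in for obj->refcount */
	struct model_lru *lru;  /* stands in for obj->lru */
	struct model_obj *next;
};

struct model_lru {
	struct model_obj *head;
	struct model_obj *tail;
	size_t count;
};

/* Detach and return the first object, or NULL if the list is empty. */
static struct model_obj *model_lru_pop(struct model_lru *lru)
{
	struct model_obj *obj = lru->head;

	if (!obj)
		return NULL;

	lru->head = obj->next;
	if (!lru->head)
		lru->tail = NULL;
	obj->next = NULL;
	obj->lru = NULL;
	lru->count--;
	return obj;
}

/* Append an object at the tail and record which list owns it. */
static void model_lru_push_tail(struct model_lru *lru, struct model_obj *obj)
{
	obj->next = NULL;
	obj->lru = lru;
	if (lru->tail)
		lru->tail->next = obj;
	else
		lru->head = obj;
	lru->tail = obj;
	lru->count++;
}

/*
 * Check the refcount first; only live objects are parked on
 * still_in_lru. Dying objects stay off every list, so their lru
 * pointer never refers to the scan-local list.
 * Returns the number of objects parked on still_in_lru.
 */
static size_t model_scan(struct model_lru *lru, struct model_lru *still_in_lru)
{
	struct model_obj *obj;
	size_t parked = 0;

	while ((obj = model_lru_pop(lru)) != NULL) {
		if (obj->refcount == 0)
			continue; /* being freed: removed and skipped */

		model_lru_push_tail(still_in_lru, obj);
		parked++;
	}

	return parked;
}
```

The model says nothing about the unlocked read of obj->lru on the release side, which is the problem discussed next.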

Actually, this thing is still racy, because obj->lru is dereferenced
without the lru->lock held in drm_gem_object_release(). At this point
I'm wondering if we should expose a drm_gem_lru_remove() taking the LRU
lock as an argument as suggested by Steve, and delegate the
responsibility to call drm_gem_lru_remove() to the driver. Either that,
or we make it so the LRU lock is attached to the drm_device instead of
the GEM (both MSM and panthor assume a device-wide lock for LRU
manipulation).
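
To make the first option a bit more concrete, here is a rough userspace sketch of a removal helper that takes the lock as an argument, so that the object's lru pointer is only read once the lock is held. Everything in it is an assumption for illustration: `sketch_obj`, `sketch_lru`, `sketch_lru_add_locked()` and `sketch_lru_remove()` are made-up names, a pthread mutex stands in for the kernel mutex, and this is not a proposed kernel signature.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's structures. */
struct sketch_lru;

struct sketch_obj {
	struct sketch_lru *lru;         /* only touched with the lock held */
	struct sketch_obj *prev, *next;
};

struct sketch_lru {
	struct sketch_obj *head;
	size_t count;
};

/* Caller must hold the lock protecting @lru. */
static void sketch_lru_add_locked(struct sketch_lru *lru, struct sketch_obj *obj)
{
	obj->prev = NULL;
	obj->next = lru->head;
	if (lru->head)
		lru->head->prev = obj;
	lru->head = obj;
	obj->lru = lru;
	lru->count++;
}

/*
 * The caller passes the lock explicitly, so obj->lru is never
 * dereferenced before the lock is taken. Calling this on an object
 * that is not on any list is a no-op.
 */
static void sketch_lru_remove(struct sketch_obj *obj, pthread_mutex_t *lock)
{
	struct sketch_lru *lru;

	pthread_mutex_lock(lock);
	lru = obj->lru; /* read under the lock */
	if (lru) {
		if (obj->prev)
			obj->prev->next = obj->next;
		else
			lru->head = obj->next;
		if (obj->next)
			obj->next->prev = obj->prev;
		obj->prev = NULL;
		obj->next = NULL;
		obj->lru = NULL;
		lru->count--;
	}
	pthread_mutex_unlock(lock);
}
```

With a device-wide lock (the second option), the driver would not need to pass anything: the helper could reach the lock through the object's device instead. The sketch keeps the explicit argument because that is the variant Steve suggested.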

Rob, what's your take on this matter?
