From: Boris Brezillon <boris.brezillon@collabora.com>
To: Rob Herring <robh+dt@kernel.org>
Cc: Steven Price <steven.price@arm.com>,
Tomeu Vizoso <tomeu@tomeuvizoso.net>,
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>,
stable <stable@vger.kernel.org>,
dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH 2/8] drm/panfrost: Fix a race in panfrost_ioctl_madvise()
Date: Fri, 6 Dec 2019 09:08:09 +0100
Message-ID: <20191206090809.0832f4aa@collabora.com>
In-Reply-To: <20191206085327.66a8c479@collabora.com>
On Fri, 6 Dec 2019 08:53:27 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:
> On Thu, 5 Dec 2019 17:08:02 -0600
> Rob Herring <robh+dt@kernel.org> wrote:
>
> > On Fri, Nov 29, 2019 at 8:33 AM Boris Brezillon
> > <boris.brezillon@collabora.com> wrote:
> > >
> > > On Fri, 29 Nov 2019 14:24:48 +0000
> > > Steven Price <steven.price@arm.com> wrote:
> > >
> > > > On 29/11/2019 13:59, Boris Brezillon wrote:
> > > > > If 2 threads change the MADVISE property of the same BO in parallel we
> > > > > might end up with an shmem->madv value that's inconsistent with the
> > > > > presence of the BO in the shrinker list.
> > > >
> > > > I'm a bit worried from the point of view of user space sanity that you
> > > > observed this - but clearly the kernel should be robust!
> > >
> > > It's not something I observed, just found the race by inspecting the
> > > code, and I thought it was worth fixing it.
> >
> > I'm not so sure there's a race.
>
> I'm pretty sure there's one:
>
>         T0                              T1
>
>   lock(pages)
>   madv = 1
>   unlock(pages)
>
>                                   lock(pages)
>                                   madv = 0
>                                   unlock(pages)
>
>                                   lock(shrinker)
>                                   remove_from_list(bo)
>                                   unlock(shrinker)
>
>   lock(shrinker)
>   add_to_list(bo)
>   unlock(shrinker)
>
> You end up with madv = 0 and the BO is added to the list.
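
The interleaving above can be replayed deterministically in user space. What follows is only a minimal sketch, not the driver code: struct toy_bo, set_madv(), update_list(), madvise_fixed() and the two replay helpers are hypothetical names, pthread mutexes stand in for the kernel mutexes, and a boolean stands in for membership in the shrinker list. It assumes madv = 1 means "purgeable, should be on the list" and madv = 0 means "needed, should be off the list".

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy stand-in for a BO: only the two fields involved in the race. */
struct toy_bo {
	int madv;	/* 1 = purgeable, 0 = needed (assumption of this sketch) */
	bool on_list;	/* models presence in the shrinker list */
};

static pthread_mutex_t pages_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t shrinker_lock = PTHREAD_MUTEX_INITIALIZER;

/* First half of the unfixed path: madv is updated under the pages lock only. */
static void set_madv(struct toy_bo *bo, int madv)
{
	pthread_mutex_lock(&pages_lock);
	bo->madv = madv;
	pthread_mutex_unlock(&pages_lock);
}

/* Second half of the unfixed path: the list is updated under the shrinker lock. */
static void update_list(struct toy_bo *bo, int madv)
{
	pthread_mutex_lock(&shrinker_lock);
	bo->on_list = (madv == 1);
	pthread_mutex_unlock(&shrinker_lock);
}

/* Fixed variant: both updates sit in one shrinker_lock critical section. */
static void madvise_fixed(struct toy_bo *bo, int madv)
{
	assert(madv == 0 || madv == 1);
	pthread_mutex_lock(&shrinker_lock);
	set_madv(bo, madv);		/* shrinker lock first, then pages lock */
	bo->on_list = (madv == 1);
	pthread_mutex_unlock(&shrinker_lock);
}

/* Replays the T0/T1 interleaving; returns true if the final state is consistent. */
static bool replay_racy(void)
{
	struct toy_bo bo = { .madv = 0, .on_list = false };

	set_madv(&bo, 1);	/* T0 */
	set_madv(&bo, 0);	/* T1 */
	update_list(&bo, 0);	/* T1: remove_from_list(bo) */
	update_list(&bo, 1);	/* T0: add_to_list(bo) */
	return bo.on_list == (bo.madv == 1);
}

/* With the fix the two calls cannot interleave, so they run one after the other. */
static bool replay_fixed(void)
{
	struct toy_bo bo = { .madv = 0, .on_list = false };

	madvise_fixed(&bo, 1);	/* T0 */
	madvise_fixed(&bo, 0);	/* T1 */
	return bo.on_list == (bo.madv == 1);
}
```

In this model replay_racy() ends with madv = 0 and the BO on the list, so it reports an inconsistent state, while replay_fixed() stays consistent whichever thread goes first.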
>
> > If there is, we still check madv value
> > when purging, so it would be harmless even if the state is
> > inconsistent.
>
> Indeed. Note that you could also have this other situation where the BO
> is marked purgeable but not present in the list. In that case it will
> never be purged, but that's kind of user space's fault anyway. I agree,
> none of these problems is critical, and I'm fine leaving it unfixed as
> long as it's documented somewhere that the race exists and is harmless.
>
> >
> > > > > The easiest solution to fix that is to protect the
> > > > > drm_gem_shmem_madvise() call with the shrinker lock.
> > > > >
> > > > > Fixes: 013b65101315 ("drm/panfrost: Add madvise and shrinker support")
> > > > > Cc: <stable@vger.kernel.org>
> > > > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > >
> > > > Reviewed-by: Steven Price <steven.price@arm.com>
> > >
> > > Thanks.
> > >
> > > >
> > > > > ---
> > > > > drivers/gpu/drm/panfrost/panfrost_drv.c | 9 ++++-----
> > > > > 1 file changed, 4 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > > > > index f21bc8a7ee3a..efc0a24d1f4c 100644
> > > > > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> > > > > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > > > > @@ -347,20 +347,19 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,
> > > > > return -ENOENT;
> > > > > }
> > > > >
> > > > > + mutex_lock(&pfdev->shrinker_lock);
> > > > > args->retained = drm_gem_shmem_madvise(gem_obj, args->madv);
> >
> > This means we now hold the shrinker_lock while we take the pages_lock.
> > Is lockdep happy with this change? I suspect not given all the fun I
> > had getting lockdep happy.
>
> I have tested with lockdep enabled and it's all good from lockdep PoV
> because the locks are taken in the same order in the madvise() and
> shrinker_scan() path (first the shrinker lock, then the pages lock).
>
> Note that patch 7 introduces a deadlock in the shrinker path, but this
> is unrelated to this shrinker lock being taken earlier in madvise
> (drm_gem_put_pages() is called while the pages lock is already held).
My bad, there's no deadlock in this version, because we don't use
->pages_use_count to retain the page table (we just use a gpu_usecount
in patch 8 to prevent the purge). But I started working on a version
that uses ->pages_use_count instead of introducing yet another
refcount, and in this version I take/release a ref on the page table in
the mmu_map()/mmu_unmap() path. This causes a deadlock when GEM mappings
are torn down by the shrinker logic (because the pages lock is already
taken in panfrost_gem_purge())...
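
The self-deadlock described above can be illustrated in user space. This is a hedged sketch, not the kernel code: toy_purge() and toy_put_pages() are hypothetical stand-ins for the purge and page-release paths, and a POSIX error-checking mutex is used so that the second lock attempt returns EDEADLK instead of hanging the way a kernel mutex would.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t pages_lock;

/* Create pages_lock as an error-checking mutex so relocking is reported. */
static int init_pages_lock(void)
{
	pthread_mutexattr_t attr;
	int ret = pthread_mutexattr_init(&attr);

	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&pages_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Hypothetical release path: takes pages_lock to drop a page reference. */
static int toy_put_pages(int *pages_use_count)
{
	int ret = pthread_mutex_lock(&pages_lock);

	if (ret)
		return ret;	/* EDEADLK when the caller already holds the lock */
	(*pages_use_count)--;
	pthread_mutex_unlock(&pages_lock);
	return 0;
}

/* Hypothetical purge path: holds pages_lock while tearing down mappings. */
static int toy_purge(int *pages_use_count)
{
	int ret = pthread_mutex_lock(&pages_lock);

	if (ret)
		return ret;
	ret = toy_put_pages(pages_use_count);	/* unmap drops its page ref */
	pthread_mutex_unlock(&pages_lock);
	return ret;
}
```

After init_pages_lock() succeeds, toy_purge() returns EDEADLK and leaves the counter untouched, which is the user-space equivalent of the shrinker path blocking forever on a lock it already owns.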