From: Peter Zijlstra <peterz@infradead.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org,
mjguzik@gmail.com, oliver.sang@intel.com,
mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
hughd@google.com, lokeshgidra@google.com, minchan@google.com,
jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
linux-doc@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count
Date: Tue, 17 Dec 2024 11:30:35 +0100
Message-ID: <20241217103035.GD11133@noisy.programming.kicks-ass.net>
In-Reply-To: <CAJuCfpEu_rZkC+ktWXE=rA-VenFBZR9VQ-SnVkDbXUqsd3Ys_A@mail.gmail.com>
On Mon, Dec 16, 2024 at 01:44:45PM -0800, Suren Baghdasaryan wrote:
> On Mon, Dec 16, 2024 at 1:38 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Dec 16, 2024 at 11:24:13AM -0800, Suren Baghdasaryan wrote:
> > > +static inline void vma_refcount_put(struct vm_area_struct *vma)
> > > +{
> > > +	int refcnt;
> > > +
> > > +	if (!__refcount_dec_and_test(&vma->vm_refcnt, &refcnt)) {
> > > +		rwsem_release(&vma->vmlock_dep_map, _RET_IP_);
> > > +
> > > +		if (refcnt & VMA_STATE_LOCKED)
> > > +			rcuwait_wake_up(&vma->vm_mm->vma_writer_wait);
> > > +	}
> > > +}
> > > +
> > >  /*
> > >   * Try to read-lock a vma. The function is allowed to occasionally yield false
> > >   * locked result to avoid performance overhead, in which case we fall back to
> > > @@ -710,6 +728,8 @@ static inline void vma_lock_init(struct vm_area_struct *vma)
> > >   */
> > >  static inline bool vma_start_read(struct vm_area_struct *vma)
> > >  {
> > > +	int oldcnt;
> > > +
> > >  	/*
> > >  	 * Check before locking. A race might cause false locked result.
> > >  	 * We can use READ_ONCE() for the mm_lock_seq here, and don't need
> > > @@ -720,13 +740,20 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> > >  	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq.sequence))
> > >  		return false;
> > >
> > > +
> > > +	rwsem_acquire_read(&vma->vmlock_dep_map, 0, 0, _RET_IP_);
> > > +	/* Limit at VMA_STATE_LOCKED - 2 to leave one count for a writer */
> > > +	if (unlikely(!__refcount_inc_not_zero_limited(&vma->vm_refcnt, &oldcnt,
> > > +						      VMA_STATE_LOCKED - 2))) {
> > > +		rwsem_release(&vma->vmlock_dep_map, _RET_IP_);
> > >  		return false;
> > > +	}
> > > +	lock_acquired(&vma->vmlock_dep_map, _RET_IP_);
> > >
> > >  	/*
> > > +	 * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> > >  	 * False unlocked result is impossible because we modify and check
> > > +	 * vma->vm_lock_seq under vma->vm_refcnt protection and mm->mm_lock_seq
> > >  	 * modification invalidates all existing locks.
> > >  	 *
> > >  	 * We must use ACQUIRE semantics for the mm_lock_seq so that if we are
> > > @@ -734,10 +761,12 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> > >  	 * after it has been unlocked.
> > >  	 * This pairs with RELEASE semantics in vma_end_write_all().
> > >  	 */
> > > +	if (oldcnt & VMA_STATE_LOCKED ||
> > > +	    unlikely(vma->vm_lock_seq == raw_read_seqcount(&vma->vm_mm->mm_lock_seq))) {
> > > +		vma_refcount_put(vma);
> >
> > Suppose we have detach race with a concurrent RCU lookup like:
> >
> >                                 vma = mas_lookup();
> >
> >     vma_start_write();
> >     mas_detach();
> >                                 vma_start_read()
> >                                 rwsem_acquire_read()
> >                                 inc // success
> >     vma_mark_detach();
> >     dec_and_test // assumes 1->0
> >                  // is actually 2->1
> >
> >                                 if (vm_lock_seq == vma->vm_mm->mm_lock_seq) // true
> >                                   vma_refcount_put
> >                                     dec_and_test() // 1->0
> >                                       *NO* rwsem_release()
> >
>
> Yes, this is possible. I think that's not a problem until we start
> reusing the vmas and I deal with this race later in this patchset.
> I think what you described here is the same race I mention in the
> description of this patch:
> https://lore.kernel.org/all/20241216192419.2970941-14-surenb@google.com/
> I introduce vma_ensure_detached() in that patch to handle this case
> and ensure that vmas are detached before they are returned into the
> slab cache for reuse. Does that make sense?
So I just replied there, and no, I don't think it makes sense. Just put
the kmem_cache_free() in vma_refcount_put(), to be done on 0.
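
To make the "free on 0" idea concrete, here is a minimal userspace sketch,
not kernel code. Everything in it is a hypothetical stand-in: struct
vma_model plays the role of the vma, C11 atomics replace refcount_t, and
vma_model_free() stands in for kmem_cache_free(). The only point is that
whichever side drops the last reference does the free, so the detach race
quoted above no longer needs a separate "ensure detached" step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct vm_area_struct. */
struct vma_model {
	atomic_int refcnt;	/* models vma->vm_refcnt */
};

static int vma_model_frees;	/* counts frees, for demonstration only */

/* Stand-in for kmem_cache_free(); the real call takes a cache and an object. */
static void vma_model_free(struct vma_model *vma)
{
	vma_model_frees++;
	free(vma);
}

static struct vma_model *vma_model_alloc(void)
{
	struct vma_model *vma = malloc(sizeof(*vma));

	if (vma)
		atomic_init(&vma->refcnt, 1);	/* attached: one reference */
	return vma;
}

/* Reader side: take a reference unless the count already hit zero. */
static bool vma_model_get(struct vma_model *vma)
{
	int old = atomic_load(&vma->refcnt);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&vma->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Whoever drops the last reference frees. Returns true if it freed. */
static bool vma_model_put(struct vma_model *vma)
{
	int old = atomic_fetch_sub(&vma->refcnt, 1);

	assert(old > 0);	/* catch underflow in the model */
	if (old == 1) {
		vma_model_free(vma);
		return true;
	}
	return false;
}
```

In the racy interleaving, the writer's put sees 2->1 and does nothing,
and the reader's put sees 1->0 and frees. Neither side has to know which
one it is.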
Anyway, my point was more about the weird entanglement of lockdep and
the refcount. Just pull the lockdep annotation out of _put() and put it
explicitly in the vma_start_read() error paths and vma_end_read().
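
A rough userspace sketch of that split, under loud assumptions: dep_held
is a toy counter standing in for lockdep's held-lock tracking, struct
rvma_model and all model_* names are made up for illustration, and the
race_hook parameter exists only so the quoted interleaving can be
replayed on a single thread. The put is a pure refcount operation, and
every path that annotated an acquire releases it explicitly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the fields involved; not the kernel layout. */
struct rvma_model {
	atomic_int refcnt;	/* models vma->vm_refcnt */
	int lock_seq;		/* models vma->vm_lock_seq */
	int dep_held;		/* toy stand-in for lockdep state */
};

static void model_dep_acquire_read(struct rvma_model *vma)
{
	vma->dep_held++;
}

static void model_dep_release(struct rvma_model *vma)
{
	assert(vma->dep_held > 0);	/* unbalanced release */
	vma->dep_held--;
}

/* Pure refcount operation, no annotation. Returns true on 1->0. */
static bool model_refcount_put(struct rvma_model *vma)
{
	return atomic_fetch_sub(&vma->refcnt, 1) == 1;
}

/* Writer side of the replay: drops the reference held while attached. */
static void model_writer_detach(struct rvma_model *vma)
{
	model_refcount_put(vma);
}

/*
 * race_hook (hypothetical, may be NULL) runs right after the increment,
 * which is where the writer's dec_and_test lands in the quoted diagram.
 */
static bool model_start_read(struct rvma_model *vma, int mm_lock_seq,
			     void (*race_hook)(struct rvma_model *))
{
	int old;

	model_dep_acquire_read(vma);
	old = atomic_load(&vma->refcnt);
	do {
		if (old == 0) {
			model_dep_release(vma);	/* error path: inc failed */
			return false;
		}
	} while (!atomic_compare_exchange_weak(&vma->refcnt, &old, old + 1));

	if (race_hook)
		race_hook(vma);

	if (vma->lock_seq == mm_lock_seq) {
		/* error path: write-locked; release no matter what put sees */
		model_dep_release(vma);
		model_refcount_put(vma);	/* a real version would free on 1->0 */
		return false;
	}
	return true;
}

static void model_end_read(struct rvma_model *vma)
{
	model_dep_release(vma);
	model_refcount_put(vma);
}
```

With the release tied to the code path rather than to the value the
decrement happened to observe, the 1->0 case in the error path can no
longer skip it.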
Additionally, having vma_end_write() would allow you to put a lockdep
annotation in vma_{start,end}_write() -- which, I think, was the original
reason I proposed it a while back. That, and it improves clarity when
reading the code, since explicitly marking the end of a section is
helpful.
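
As a toy illustration of the pairing, again in userspace C with made-up
names: write_sections is a plain counter standing in for a lockdep
annotation. One assumption here goes beyond what is stated above:
model_vma_end_write() only closes the annotated section, while the
actual unlock still happens when model_vma_end_write_all() bumps the mm
sequence.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical models; field names only loosely follow the kernel's. */
struct mm_model {
	int lock_seq;		/* models mm->mm_lock_seq */
	int write_sections;	/* toy stand-in for a lockdep annotation */
};

struct wvma_model {
	struct mm_model *mm;
	int lock_seq;		/* models vma->vm_lock_seq */
};

static bool model_vma_is_write_locked(const struct wvma_model *vma)
{
	return vma->lock_seq == vma->mm->lock_seq;
}

static void model_vma_start_write(struct wvma_model *vma)
{
	vma->mm->write_sections++;		/* annotation: section opens */
	vma->lock_seq = vma->mm->lock_seq;	/* marks the vma write-locked */
}

/* Assumed semantics: closes the annotated section, does not unlock. */
static void model_vma_end_write(struct wvma_model *vma)
{
	assert(model_vma_is_write_locked(vma));
	assert(vma->mm->write_sections > 0);
	vma->mm->write_sections--;		/* annotation: section closes */
}

/* Unlocks every vma at once by moving the mm sequence forward. */
static void model_vma_end_write_all(struct mm_model *mm)
{
	assert(mm->write_sections == 0);	/* all sections were closed */
	mm->lock_seq++;
}
```

The assert in model_vma_end_write_all() is the kind of check an explicit
end marker makes possible: a write section that was opened but never
closed gets caught at a well-defined point.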