From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org,
liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
david.laight.linux@gmail.com, mhocko@suse.com, vbabka@suse.cz,
hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
mgorman@techsingularity.net, david@redhat.com,
peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net,
paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com,
hdanton@sina.com, hughd@google.com, lokeshgidra@google.com,
minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
souravpanda@google.com, pasha.tatashin@soleen.com,
klarasmodin@gmail.com, richard.weiyang@gmail.com,
corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@android.com,
surenb@google.com
Subject: [PATCH v9 00/17] reimplement per-vma lock as a refcount
Date: Fri, 10 Jan 2025 20:25:47 -0800
Message-ID: <20250111042604.3230628-1-surenb@google.com>

Back when per-vma locks were introduced, vm_lock was moved out of
vm_area_struct in [1] because of the performance regression caused by
false cacheline sharing. Recent investigation [2] revealed that the
regression is limited to a rather old Broadwell microarchitecture and
even there it can be mitigated by disabling adjacent cacheline
prefetching, see [3].
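
As a rough illustration of the layout concern, the sketch below starts a
frequently written, lock-like member on its own cacheline so that writes
to it do not invalidate the line holding the neighboring fields. The
struct, its field names and the 64-byte line size are assumptions made
for illustration; this is not the kernel's vm_area_struct layout.

```c
#include <stdalign.h>
#include <stddef.h>

#define DEMO_CACHELINE 64	/* assumed line size; real hardware may differ */

/* Hypothetical stand-in for a structure with an embedded lock. */
struct demo_vma {
	unsigned long start;
	unsigned long end;
	/* Start the frequently written member on a fresh cacheline. */
	alignas(DEMO_CACHELINE) int lock_word;
};

/* Returns nonzero when lock_word begins exactly on a cacheline boundary. */
static int demo_lock_is_line_aligned(void)
{
	return offsetof(struct demo_vma, lock_word) % DEMO_CACHELINE == 0;
}
```

Aligning to a single line does not help when the hardware also fetches
the adjacent line, which is the case described above as being mitigated
by disabling adjacent cacheline prefetching.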

Splitting a single logical structure into multiple ones leads to more
complicated management, extra pointer dereferences and overall less
maintainable code. When the split-away part is a lock, it complicates
things even further. With no performance benefit, there is no reason
to keep this split. Merging vm_lock back into vm_area_struct also allows
vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset.
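
The reuse that SLAB_TYPESAFE_BY_RCU permits means a reader can find an
object that has since been freed and reallocated for something else, so
a lookup generally has to take a reference first and then re-check the
object's identity. Below is a minimal userspace sketch of that pattern
using C11 atomics. struct demo_obj, its key field and the helper names
are hypothetical and are not taken from the patches.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_obj {
	atomic_int refcnt;	/* 0: not in use, readers must back off */
	atomic_ulong key;	/* identity; changes if the memory is reused */
};

/* Take a reference unless the count is zero. */
static bool demo_tryget(struct demo_obj *o)
{
	int old = atomic_load_explicit(&o->refcnt, memory_order_relaxed);

	do {
		if (old == 0)
			return false;
	} while (!atomic_compare_exchange_weak_explicit(&o->refcnt, &old,
							old + 1,
							memory_order_acquire,
							memory_order_relaxed));
	return true;
}

static void demo_put(struct demo_obj *o)
{
	atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_release);
}

/*
 * Returns o with a reference held if it still carries the expected key,
 * otherwise drops the reference again and returns NULL.
 */
static struct demo_obj *demo_get_validated(struct demo_obj *o,
					   unsigned long key)
{
	if (!demo_tryget(o))
		return NULL;
	if (atomic_load_explicit(&o->key, memory_order_relaxed) != key) {
		demo_put(o);	/* object was reused for something else */
		return NULL;
	}
	return o;
}
```

The sketch only works because the counter lives inside the object and
the memory keeps its type after being freed, which is why an embedded
lock or refcount fits this kind of cache.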

This patchset:
1. moves vm_lock back into vm_area_struct, aligning it at the cacheline
   boundary and changing the cache to be cacheline-aligned to minimize
   cacheline sharing;
2. changes vm_area_struct initialization to mark a new vma as detached
   until it is inserted into the vma tree;
3. replaces vm_lock and the vma->detached flag with a reference counter;
4. regroups vm_area_struct members to fit them into 3 cachelines;
5. changes the vm_area_struct cache to SLAB_TYPESAFE_BY_RCU to allow vma
   reuse and to minimize call_rcu() calls.
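
To make step 3 more concrete, here is a userspace sketch of how a single
counter could stand in for both a lock and a detached flag. The encoding
is an assumption chosen for illustration (0 means detached, 1 means
attached with no readers, each reader adds 1, and a writer adds a large
bit that pushes the count past the reader limit). The names and
constants are hypothetical, and a real writer would also have to wait
for existing readers to drain, which is omitted here.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_WRITER_BIT		(1 << 30)	/* assumed writer marker */
#define DEMO_READER_LIMIT	(DEMO_WRITER_BIT - 1)

struct demo_area {
	atomic_int refcnt;	/* 0: detached, 1: attached, >1: readers */
};

/* Reader side: increment unless detached or at/over the limit. */
static bool demo_read_trylock(struct demo_area *a)
{
	int old = atomic_load_explicit(&a->refcnt, memory_order_relaxed);

	do {
		if (old == 0 || old >= DEMO_READER_LIMIT)
			return false;
	} while (!atomic_compare_exchange_weak_explicit(&a->refcnt, &old,
							old + 1,
							memory_order_acquire,
							memory_order_relaxed));
	return true;
}

static void demo_read_unlock(struct demo_area *a)
{
	atomic_fetch_sub_explicit(&a->refcnt, 1, memory_order_release);
}

/* Attaching makes the object visible to readers. */
static void demo_attach(struct demo_area *a)
{
	atomic_store_explicit(&a->refcnt, 1, memory_order_release);
}

/* Writer side: the added bit makes new readers fail the limit check. */
static void demo_write_mark(struct demo_area *a)
{
	atomic_fetch_add_explicit(&a->refcnt, DEMO_WRITER_BIT,
				  memory_order_acquire);
}

static void demo_write_unmark(struct demo_area *a)
{
	atomic_fetch_sub_explicit(&a->refcnt, DEMO_WRITER_BIT,
				  memory_order_release);
}
```

The limit check is what lets one compare-and-swap loop reject both the
detached case and the writer-present case.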
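
Pagefault microbenchmarks show a performance improvement of roughly
10-15% in the 21-and-above configurations and little change (within
about 1.3%) in the smaller ones: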
Hmean     faults/cpu-1    507926.5547 (   0.00%)   506519.3692 * -0.28%*
Hmean     faults/cpu-4    479119.7051 (   0.00%)   481333.6802 *  0.46%*
Hmean     faults/cpu-7    452880.2961 (   0.00%)   455845.6211 *  0.65%*
Hmean     faults/cpu-12   347639.1021 (   0.00%)   352004.2254 *  1.26%*
Hmean     faults/cpu-21   200061.2238 (   0.00%)   229597.0317 * 14.76%*
Hmean     faults/cpu-30   145251.2001 (   0.00%)   164202.5067 * 13.05%*
Hmean     faults/cpu-48   106848.4434 (   0.00%)   120641.5504 * 12.91%*
Hmean     faults/cpu-56    92472.3835 (   0.00%)   103464.7916 * 11.89%*
Hmean     faults/sec-1    507566.1468 (   0.00%)   506139.0811 * -0.28%*
Hmean     faults/sec-4   1880478.2402 (   0.00%)  1886795.6329 *  0.34%*
Hmean     faults/sec-7   3106394.3438 (   0.00%)  3140550.7485 *  1.10%*
Hmean     faults/sec-12  4061358.4795 (   0.00%)  4112477.0206 *  1.26%*
Hmean     faults/sec-21  3988619.1169 (   0.00%)  4577747.1436 * 14.77%*
Hmean     faults/sec-30  3909839.5449 (   0.00%)  4311052.2787 * 10.26%*
Hmean     faults/sec-48  4761108.4691 (   0.00%)  5283790.5026 * 10.98%*
Hmean     faults/sec-56  4885561.4590 (   0.00%)  5415839.4045 * 10.85%*
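
The percentage column can be reproduced from the two value columns,
assuming the first is the baseline and the second is the patched kernel
(the 0.00% marker suggests this, but the table itself is unlabeled). The
helper name below is hypothetical.

```c
/* Relative change of `patched` versus `base`, in percent. */
static double demo_pct_change(double base, double patched)
{
	return (patched - base) / base * 100.0;
}
```

For example, the faults/cpu-21 row corresponds to
demo_pct_change(200061.2238, 229597.0317).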
Changes since v8 [4]:
- Change subject for the cover letter, per Vlastimil Babka
- Added Reviewed-by and Acked-by, per Vlastimil Babka
- Added static check for no-limit case in __refcount_add_not_zero_limited,
per David Laight
- Fixed vma_refcount_put() to call rwsem_release() unconditionally,
per Hillf Danton and Vlastimil Babka
- Use a copy of vma->vm_mm in vma_refcount_put() in case vma is freed from
under us, per Vlastimil Babka
- Removed extra rcu_read_lock()/rcu_read_unlock() in vma_end_read(),
per Vlastimil Babka
- Changed __vma_enter_locked() parameter to centralize refcount logic,
per Vlastimil Babka
- Amended description in vm_lock replacement patch explaining the effects
of the patch on vm_area_struct size, per Vlastimil Babka
- Added vm_area_struct member regrouping patch [5] into the series
- Renamed vma_copy() into vm_area_init_from(), per Liam R. Howlett
- Added a comment for vm_area_struct to update vm_area_init_from() when
adding new members, per Vlastimil Babka
- Updated a comment about unstable src->shared.rb when copying a vma in
vm_area_init_from(), per Vlastimil Babka
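
The vma_refcount_put() change about copying vma->vm_mm follows a general
rule for refcounted objects: anything needed after dropping a reference
must be read before the decrement. A userspace sketch of that ordering
follows. The types, the names and the wake-up condition are placeholders
chosen for illustration and do not reflect the actual kernel code.

```c
#include <stdatomic.h>

/* Hypothetical longer-lived owner, standing in for the mm. */
struct demo_owner {
	atomic_int wakeups;
};

/* Hypothetical refcounted object, standing in for the vma. */
struct demo_node {
	atomic_int refcnt;
	struct demo_owner *owner;
};

static void demo_node_put(struct demo_node *n)
{
	/* Copy what is needed before the decrement... */
	struct demo_owner *owner = n->owner;
	int old = atomic_fetch_sub_explicit(&n->refcnt, 1,
					    memory_order_release);

	/*
	 * ...because after it, *n may already be freed or reused, so
	 * only the local copy is touched from here on. Placeholder
	 * condition: notify the owner when the last extra reference
	 * goes away.
	 */
	if (old == 2)
		atomic_fetch_add_explicit(&owner->wakeups, 1,
					  memory_order_relaxed);
}
```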
[1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
[2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
[3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/
[4] https://lore.kernel.org/all/20250109023025.2242447-1-surenb@google.com/
[5] https://lore.kernel.org/all/20241111205506.3404479-5-surenb@google.com/
Patchset applies over mm-unstable after reverting v8
(current SHA range: 235b5129cb7b - 9e6b24c58985)

Suren Baghdasaryan (17):
  mm: introduce vma_start_read_locked{_nested} helpers
  mm: move per-vma lock into vm_area_struct
  mm: mark vma as detached until it's added into vma tree
  mm: introduce vma_iter_store_attached() to use with attached vmas
  mm: mark vmas detached upon exit
  types: move struct rcuwait into types.h
  mm: allow vma_start_read_locked/vma_start_read_locked_nested to fail
  mm: move mmap_init_lock() out of the header file
  mm: uninline the main body of vma_start_write()
  refcount: introduce __refcount_{add|inc}_not_zero_limited
  mm: replace vm_lock and detached flag with a reference count
  mm: move lesser used vma_area_struct members into the last cacheline
  mm/debug: print vm_refcnt state when dumping the vma
  mm: remove extra vma_numab_state_init() call
  mm: prepare lock_vma_under_rcu() for vma reuse possibility
  mm: make vma cache SLAB_TYPESAFE_BY_RCU
  docs/mm: document latest changes to vm_lock

 Documentation/mm/process_addrs.rst |  44 ++++----
 include/linux/mm.h                 | 156 ++++++++++++++++++++++-------
 include/linux/mm_types.h           |  75 +++++++-------
 include/linux/mmap_lock.h          |   6 --
 include/linux/rcuwait.h            |  13 +--
 include/linux/refcount.h           |  24 ++++-
 include/linux/slab.h               |   6 --
 include/linux/types.h              |  12 +++
 kernel/fork.c                      | 129 +++++++++++-------------
 mm/debug.c                         |  12 +++
 mm/init-mm.c                       |   1 +
 mm/memory.c                        |  97 ++++++++++++++++--
 mm/mmap.c                          |   3 +-
 mm/userfaultfd.c                   |  32 +++---
 mm/vma.c                           |  23 ++---
 mm/vma.h                           |  15 ++-
 tools/testing/vma/linux/atomic.h   |   5 +
 tools/testing/vma/vma_internal.h   |  93 ++++++++---------
 18 files changed, 465 insertions(+), 281 deletions(-)
--
2.47.1.613.gc27f4b7a9f-goog