From: Wei Yang <richard.weiyang@gmail.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, peterz@infradead.org,
willy@infradead.org, liam.howlett@oracle.com,
lorenzo.stoakes@oracle.com, david.laight.linux@gmail.com,
mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org,
mjguzik@gmail.com, oliver.sang@intel.com,
mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
hughd@google.com, lokeshgidra@google.com, minchan@google.com,
jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
pasha.tatashin@soleen.com, klarasmodin@gmail.com,
richard.weiyang@gmail.com, corbet@lwn.net,
linux-doc@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v9 11/17] mm: replace vm_lock and detached flag with a reference count
Date: Sun, 12 Jan 2025 02:59:35 +0000
Message-ID: <20250112025935.7mxi3klm5ijkb73m@master>
In-Reply-To: <20250111042604.3230628-12-surenb@google.com>
On Fri, Jan 10, 2025 at 08:25:58PM -0800, Suren Baghdasaryan wrote:
>rw_semaphore is a sizable structure of 40 bytes and consumes
>considerable space for each vm_area_struct. However vma_lock has
>two important specifics which can be used to replace rw_semaphore
>with a simpler structure:
>1. Readers never wait. They try to take the vma_lock and fall back to
>mmap_lock if that fails.
>2. Only one writer at a time will ever try to write-lock a vma_lock
>because writers first take mmap_lock in write mode.
>Because of these requirements, full rw_semaphore functionality is not
>needed and we can replace rw_semaphore and the vma->detached flag with
>a refcount (vm_refcnt).

This paragraph is merged into the one above it in the commit log, which may
not be what you expect. It is only a formatting issue; I am not sure why the
two are not separated.

>When vma is in detached state, vm_refcnt is 0 and only a call to
>vma_mark_attached() can take it out of this state. Note that unlike
>before, now we enforce both vma_mark_attached() and vma_mark_detached()
>to be done only after vma has been write-locked. vma_mark_attached()
>changes vm_refcnt to 1 to indicate that it has been attached to the vma
>tree. When a reader takes read lock, it increments vm_refcnt, unless the
>top usable bit of vm_refcnt (0x40000000) is set, indicating presence of
>a writer. When writer takes write lock, it sets the top usable bit to
>indicate its presence. If there are readers, writer will wait using newly
>introduced mm->vma_writer_wait. Since all writers take mmap_lock in write
>mode first, there can be only one writer at a time. The last reader to
>release the lock will signal the writer to wake up.
>refcount might overflow if there are many competing readers, in which case
>read-locking will fail. Readers are expected to handle such failures.
>In summary:
>1. all readers increment the vm_refcnt;
>2. writer sets top usable (writer) bit of vm_refcnt;
>3. readers cannot increment the vm_refcnt if the writer bit is set;
>4. in the presence of readers, writer must wait for the vm_refcnt to drop
>to 1 (ignoring the writer bit), indicating an attached vma with no readers;

If I am right, the writer waits until vm_refcnt drops to
(VMA_LOCK_OFFSET + 1), as indicated in __vma_start_write(), rather than to 1.

>5. vm_refcnt overflow is handled by the readers.
>
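
To make the numbers in the summary concrete, here is a minimal user-space
model of the scheme as I read it. It is only a sketch: vm_refcnt and the
0x40000000 writer bit (VMA_LOCK_OFFSET) come from the patch, while
struct vma_model, the model_*() helpers and MODEL_REF_LIMIT are hypothetical
names made up for illustration. It is single-threaded, does not model
sleeping on mm->vma_writer_wait, and makes no claim about the kernel's memory
ordering or about what the writer does after the readers have drained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Writer bit value quoted in the commit log. */
#define VMA_LOCK_OFFSET 0x40000000u
/* Assumption: readers refuse to increment at or above this value. */
#define MODEL_REF_LIMIT (VMA_LOCK_OFFSET - 1)

/* Hypothetical stand-in for the refcount part of vm_area_struct. */
struct vma_model {
	atomic_uint vm_refcnt;	/* 0 = detached, 1 = attached, no readers */
};

/* Reader: increment unless detached, writer bit set, or count too high. */
static bool model_start_read(struct vma_model *v)
{
	unsigned int old = atomic_load(&v->vm_refcnt);

	do {
		if (old == 0 || (old & VMA_LOCK_OFFSET) || old >= MODEL_REF_LIMIT)
			return false;	/* caller falls back to mmap_lock */
	} while (!atomic_compare_exchange_weak(&v->vm_refcnt, &old, old + 1));

	return true;
}

/* Reader release; the real code would wake a waiting writer here. */
static void model_end_read(struct vma_model *v)
{
	atomic_fetch_sub(&v->vm_refcnt, 1);
}

/* Writer: set the writer bit; returns true if no readers remain. */
static bool model_writer_enter(struct vma_model *v)
{
	unsigned int now;

	now = atomic_fetch_add(&v->vm_refcnt, VMA_LOCK_OFFSET) + VMA_LOCK_OFFSET;
	return now == VMA_LOCK_OFFSET + 1;
}

/* Wait condition: the attached reference (1) plus the writer bit. */
static bool model_readers_drained(struct vma_model *v)
{
	return atomic_load(&v->vm_refcnt) == VMA_LOCK_OFFSET + 1;
}

/* Writer: clear the writer bit again. */
static void model_writer_exit(struct vma_model *v)
{
	atomic_fetch_sub(&v->vm_refcnt, VMA_LOCK_OFFSET);
}

/* Walk through one attach / read / write cycle; returns the final count. */
static unsigned int model_demo(void)
{
	struct vma_model v;

	atomic_init(&v.vm_refcnt, 0);
	assert(!model_start_read(&v));		/* detached: readers fail */
	atomic_store(&v.vm_refcnt, 1);		/* attached */
	assert(model_start_read(&v));		/* count is now 2 */
	assert(!model_writer_enter(&v));	/* 0x40000002: writer must wait */
	assert(!model_start_read(&v));		/* writer bit blocks new readers */
	model_end_read(&v);			/* 0x40000001 */
	assert(model_readers_drained(&v));	/* VMA_LOCK_OFFSET + 1 */
	model_writer_exit(&v);			/* back to 1 */
	return atomic_load(&v.vm_refcnt);
}
```

In this model the raw value the writer waits for is VMA_LOCK_OFFSET + 1; it
equals 1 only after the writer bit is masked off, which is how I read
"ignoring the writer bit" in item 4.
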
>While this vm_lock replacement does not yet result in a smaller
>vm_area_struct (it stays at 256 bytes due to cacheline alignment), it
>allows for further size optimization by structure member regrouping
>to bring the size of vm_area_struct below 192 bytes.
>
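
As a side note on the size argument, the rounding effect of cacheline
alignment can be shown with a small sketch. The 64-byte cacheline and the
200-byte and 192-byte payloads are assumptions chosen for illustration; they
are not the real vm_area_struct layout.

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cachelines. */
#define MODEL_CACHELINE 64

/* Hypothetical structs: only the total member size matters here. */
struct model_200 {
	alignas(MODEL_CACHELINE) unsigned char payload[200];
};

struct model_192 {
	alignas(MODEL_CACHELINE) unsigned char payload[192];
};

/* sizeof is always a multiple of the alignment, so 200 rounds up to 256. */
_Static_assert(sizeof(struct model_200) == 256, "200 bytes pad to 4 cachelines");
/* At 192 bytes or less the struct fits in 3 cachelines. */
_Static_assert(sizeof(struct model_192) == 192, "192 bytes fit 3 cachelines");
```

Any member total from 193 to 256 bytes ends up at 256 under this alignment,
so shaving a few bytes shows no gain until the total reaches 192 or less.
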
--
Wei Yang
Help you, Help me