From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>, Zi Yan <ziy@nvidia.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
Barry Song <baohua@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
Jann Horn <jannh@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Lance Yang <ioworker0@gmail.com>, SeongJae Park <sj@kernel.org>,
Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH v2 0/5] madvise cleanup
Date: Fri, 20 Jun 2025 16:33:00 +0100 [thread overview]
Message-ID: <cover.1750433500.git.lorenzo.stoakes@oracle.com> (raw)
This is a series of patches that helps address a number of historic
problems in the madvise() implementation:
* Eliminate the visitor pattern and having the code which is implemented
for both the anon_vma_name implementation and ordinary madvise()
operations use the same madvise_vma_behavior() implementation.
* Thread state through the madvise_behavior state object - this object,
very usefully introduced by SJ, is already used to transmit state through
operations. This series extends this by having all madvise() operations
use this, including anon_vma_name.
* Thread range, VMA state through madvise_behavior - This helps avoid a lot
of the confusing code around range and VMA state and again keeps things
consistent and with a single 'source of truth'.
* Addressing the very strange behaviour around the passed around struct
vm_area_struct **prev pointer - all read-only users do absolutely nothing
with the prev pointer. The only function that uses it is
madvise_update_vma(), and in all cases prev is always reset to
VMA.
Fix this by no longer having aything but madvise_update_vma() reference
prev, and having madvise_walk_vmas() update prev in each
instance. Additionally make it clear that the meaningful change in vma
state is when madvise_update_vma() potentially merges a VMA, so
explicitly retrieve the VMA in this case.
* Update and clarify the madvise_walk_vmas() function - this is a source of
a great deal of confusion, so simplify, stop using prev = NULL to signify
that the mmap lock has been dropped (!) and make that explicit, and add
some comments to explain what's going on.
v2:
* Propagated tags (thanks everyone!)
* Don't separate out __MADV_SET_ANON_VMA_NAME and __MADV_SET_CLEAR_VMA_NAME,
just use __MADV_SET_ANON_VMA_NAME as per Zi.
* Eliminate is_anon_vma_name() as no longer necessary, addressing Zi's concern
around naming another way :)
* Put mm_struct abstraction of try_vma_read_lock() into 2/5 from 3/5 as per Zi.
* Added comment about get/put anon_vma_name in madvise_vma_behavior() as per
Vlastimil.
* Renamed have_anon_name to set_new_anon_name to make it clear why we make an
exception to this get/put behaviour in madvise_vma_behavior().
* Reworded 1/4 commit message to make it clearer what's being done as per
Vlastimil.
* Avoid comma-separated decls in struct madvise_behavior_range as per Zi and
Vlastimil.
* Put fix for silly development bug (range->start comparison to end not
range->end) in 3/5 rather than 4/5 so as to eliminate it altogether,
having fixed it during development but having not put the fix in the
correct place :) as per Vlastimil.
* Rename end to last_end in madvise_walk_vmas() and added a comment for
clarity as per Vlastimil.
* Update madvise_walk_vmas() comment to no longer refer to a visitor
function.
* Separated out prev, vma fields in struct madvise_behavior as per
Vlastimil.
* Added assert on not holding VMA lock whenever mmap lock is dropped and
abstracted to mark_mmap_lock_dropped() so we always assert when we do
this, based on discussion with Vlastimil.
* Removed duplicate comment about weird -ENOMEM unmapped error behaviour.
v1:
https://lore.kernel.org/all/cover.1750363557.git.lorenzo.stoakes@oracle.com/
Lorenzo Stoakes (5):
mm/madvise: remove the visitor pattern and thread anon_vma state
mm/madvise: thread mm_struct through madvise_behavior
mm/madvise: thread VMA range state through madvise_behavior
mm/madvise: thread all madvise state through madv_behavior
mm/madvise: eliminate very confusing manipulation of prev VMA
include/linux/huge_mm.h | 9 +-
mm/khugepaged.c | 9 +-
mm/madvise.c | 585 +++++++++++++++++++++-------------------
3 files changed, 313 insertions(+), 290 deletions(-)
--
2.49.0
next reply other threads:[~2025-06-20 15:33 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-20 15:33 Lorenzo Stoakes [this message]
2025-06-20 15:33 ` [PATCH v2 1/5] mm/madvise: remove the visitor pattern and thread anon_vma state Lorenzo Stoakes
2025-06-20 16:57 ` Zi Yan
2025-06-20 17:12 ` SeongJae Park
2025-06-23 22:38 ` Barry Song
2025-06-25 7:43 ` David Hildenbrand
2025-06-20 15:33 ` [PATCH v2 2/5] mm/madvise: thread mm_struct through madvise_behavior Lorenzo Stoakes
2025-06-20 16:59 ` Zi Yan
2025-06-20 17:25 ` SeongJae Park
2025-06-23 22:42 ` Barry Song
2025-06-20 15:33 ` [PATCH v2 3/5] mm/madvise: thread VMA range state " Lorenzo Stoakes
2025-06-20 17:02 ` Zi Yan
2025-06-20 17:33 ` SeongJae Park
2025-06-24 1:05 ` Barry Song
2025-06-20 15:33 ` [PATCH v2 4/5] mm/madvise: thread all madvise state through madv_behavior Lorenzo Stoakes
2025-06-20 17:56 ` SeongJae Park
2025-06-20 18:01 ` Lorenzo Stoakes
2025-06-20 18:12 ` SeongJae Park
2025-06-20 15:33 ` [PATCH v2 5/5] mm/madvise: eliminate very confusing manipulation of prev VMA Lorenzo Stoakes
2025-06-20 17:13 ` Zi Yan
2025-06-20 18:10 ` SeongJae Park
2025-06-24 13:16 ` Lorenzo Stoakes
2025-06-24 17:57 ` Vlastimil Babka
2025-06-24 18:01 ` Lorenzo Stoakes
2025-06-20 17:21 ` [PATCH v2 0/5] madvise cleanup SeongJae Park
2025-06-20 17:33 ` Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1750433500.git.lorenzo.stoakes@oracle.com \
--to=lorenzo.stoakes@oracle.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@redhat.com \
--cc=dev.jain@arm.com \
--cc=ioworker0@gmail.com \
--cc=jannh@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=npache@redhat.com \
--cc=ryan.roberts@arm.com \
--cc=sj@kernel.org \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.