From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>, Luke Yang <luyang@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <ljs@kernel.org>,
Vlastimil Babka <vbabka@kernel.org>, Jann Horn <jannh@google.com>,
Dev Jain <dev.jain@arm.com>,
jhladky@redhat.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, Nico Pache <npache@redhat.com>
Subject: Re: [PATCH v2 0/2] mm/mprotect: micro-optimization work
Date: Wed, 1 Apr 2026 10:25:40 +0200 [thread overview]
Message-ID: <bb40b9b5-d71e-4968-8c14-86af46e33fee@kernel.org> (raw)
In-Reply-To: <20260330130635.1d8bf3f1f9dfcafc0317f5e7@linux-foundation.org>
On 3/30/26 22:06, Andrew Morton wrote:
> On Mon, 30 Mar 2026 15:55:51 -0400 Luke Yang <luyang@redhat.com> wrote:
>
>> Thanks for working on this. I just wanted to share that we've created a
>> test kernel with your patches and tested on the following CPUs:
>>
>> --- aarch64 ---
>> Ampere Altra
>> Ampere Altra Max
>>
>> --- x86_64 ---
>> AMD EPYC 7713
>> AMD EPYC 7351
>> AMD EPYC 7542
>> AMD EPYC 7573X
>> AMD EPYC 7702
>> AMD EPYC 9754
>> Intel Xeon Gold 6126
>> Intel Xeon Gold 6330
>> Intel Xeon Gold 6530
>> Intel Xeon Platinum 8351N
>> Intel Core i7-6820HQ
>>
>> --- ppc64le ---
>> IBM Power 10
>>
>> On average, we see improvements ranging from a minimum of 5% to a
>> maximum of 55%, with most improvements showing around a 25% speed up in
>> the libmicro/mprot_tw4m micro benchmark.
>
> Thanks, that's nice. I've added some of the above into the changelog
> and I took the liberty of adding your Tested-by: to both patches.
>
> fyi, regarding [2/2]: it's unclear to me whether the discussion with
> David will result in any alterations. If there's something I need to
> do, it always helps to lmk ;)

I think we first want a better understanding of which of the
__always_inline annotations in patch #2 actually help, and where to
apply the forced nr_ptes == 1 optimization.
I updated the microbenchmark I use for fork+unmap etc. to also measure
mprotect:
https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/pte-mapped-folio-benchmarks.c?ref_type=heads
Running some simple tests with order-0 on 1 GiB of memory:
Upstream (Linus' tree):

$ ./pte-mapped-folio-benchmarks 0 write-protect 5
0.005779
...
$ ./pte-mapped-folio-benchmarks 0 write-unprotect 5
0.009113
...
With Pedro's patch #2:
$ ./pte-mapped-folio-benchmarks 0 write-protect 5
0.003941
...
$ ./pte-mapped-folio-benchmarks 0 write-unprotect 5
0.006163
...
With the patch below:
$ ./pte-mapped-folio-benchmarks 0 write-protect 5
0.003364
$ ./pte-mapped-folio-benchmarks 0 write-unprotect 5
0.005729
So patch #2 can likely be improved further. Also, the forced inlining of
mprotect_folio_pte_batch() should probably be split out into a separate
patch.
---
From cf1a2a4a6ef95ed541947f2fd9d8351bef664426 Mon Sep 17 00:00:00 2001
From: "David Hildenbrand (Arm)" <david@kernel.org>
Date: Wed, 1 Apr 2026 08:15:44 +0000
Subject: [PATCH] tmp
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
---
mm/mprotect.c | 79 +++++++++++++++++++++++++++++++--------------------
1 file changed, 48 insertions(+), 31 deletions(-)
diff --git a/mm/mprotect.c b/mm/mprotect.c
index c0571445bef7..8d14c05a11a2 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -117,7 +117,7 @@ static int mprotect_folio_pte_batch(struct folio *folio, pte_t *ptep,
}
/* Set nr_ptes number of ptes, starting from idx */
-static void prot_commit_flush_ptes(struct vm_area_struct *vma, unsigned long addr,
+static __always_inline void prot_commit_flush_ptes(struct vm_area_struct *vma, unsigned long addr,
pte_t *ptep, pte_t oldpte, pte_t ptent, int nr_ptes,
int idx, bool set_write, struct mmu_gather *tlb)
{
@@ -143,7 +143,7 @@ static void prot_commit_flush_ptes(struct vm_area_struct *vma, unsigned long add
* !PageAnonExclusive() pages, starting from start_idx. Caller must enforce
* that the ptes point to consecutive pages of the same anon large folio.
*/
-static int page_anon_exclusive_sub_batch(int start_idx, int max_len,
+static __always_inline int page_anon_exclusive_sub_batch(int start_idx, int max_len,
struct page *first_page, bool expected_anon_exclusive)
{
int idx;
@@ -169,7 +169,7 @@ static int page_anon_exclusive_sub_batch(int start_idx, int max_len,
* pte of the batch. Therefore, we must individually check all pages and
* retrieve sub-batches.
*/
-static void commit_anon_folio_batch(struct vm_area_struct *vma,
+static __always_inline void commit_anon_folio_batch(struct vm_area_struct *vma,
struct folio *folio, struct page *first_page, unsigned long addr, pte_t *ptep,
pte_t oldpte, pte_t ptent, int nr_ptes, struct mmu_gather *tlb)
{
@@ -188,7 +188,7 @@ static void commit_anon_folio_batch(struct vm_area_struct *vma,
}
}
-static void set_write_prot_commit_flush_ptes(struct vm_area_struct *vma,
+static __always_inline void set_write_prot_commit_flush_ptes(struct vm_area_struct *vma,
struct folio *folio, struct page *page, unsigned long addr, pte_t *ptep,
pte_t oldpte, pte_t ptent, int nr_ptes, struct mmu_gather *tlb)
{
@@ -211,6 +211,41 @@ static void set_write_prot_commit_flush_ptes(struct vm_area_struct *vma,
commit_anon_folio_batch(vma, folio, page, addr, ptep, oldpte, ptent, nr_ptes, tlb);
}
+static __always_inline void change_present_ptes(struct mmu_gather *tlb,
+ struct vm_area_struct *vma, unsigned long addr,
+ pgprot_t newprot, unsigned long cp_flags,
+ struct folio *folio, struct page *page, pte_t *pte,
+ unsigned int nr_ptes)
+{
+ bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+ bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+ pte_t oldpte = modify_prot_start_ptes(vma, addr, pte, nr_ptes);
+ pte_t ptent = pte_modify(oldpte, newprot);
+
+ if (uffd_wp)
+ ptent = pte_mkuffd_wp(ptent);
+ else if (uffd_wp_resolve)
+ ptent = pte_clear_uffd_wp(ptent);
+
+ /*
+ * In some writable, shared mappings, we might want to catch actual
+ * write access -- see vma_wants_writenotify().
+ *
+ * In all writable, private mappings, we have to properly handle COW.
+ *
+ * In both cases, we can sometimes still change PTEs writable and avoid
+ * the write-fault handler, for example, if a PTE is already dirty and
+ * no other COW or special handling is required.
+ */
+ if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) &&
+ !pte_write(ptent))
+ set_write_prot_commit_flush_ptes(vma, folio, page, addr, pte,
+ oldpte, ptent, nr_ptes, tlb);
+ else
+ prot_commit_flush_ptes(vma, addr, pte, oldpte, ptent,
+ nr_ptes, /* idx = */ 0, /* set_write = */ false, tlb);
+}
+
static long change_pte_range(struct mmu_gather *tlb,
struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr,
unsigned long end, pgprot_t newprot, unsigned long cp_flags)
@@ -242,7 +277,6 @@ static long change_pte_range(struct mmu_gather *tlb,
int max_nr_ptes = (end - addr) >> PAGE_SHIFT;
struct folio *folio = NULL;
struct page *page;
- pte_t ptent;
/* Already in the desired state. */
if (prot_numa && pte_protnone(oldpte))
@@ -268,34 +302,17 @@ static long change_pte_range(struct mmu_gather *tlb,
nr_ptes = mprotect_folio_pte_batch(folio, pte, oldpte, max_nr_ptes, flags);
- oldpte = modify_prot_start_ptes(vma, addr, pte, nr_ptes);
- ptent = pte_modify(oldpte, newprot);
-
- if (uffd_wp)
- ptent = pte_mkuffd_wp(ptent);
- else if (uffd_wp_resolve)
- ptent = pte_clear_uffd_wp(ptent);
-
/*
- * In some writable, shared mappings, we might want
- * to catch actual write access -- see
- * vma_wants_writenotify().
- *
- * In all writable, private mappings, we have to
- * properly handle COW.
- *
- * In both cases, we can sometimes still change PTEs
- * writable and avoid the write-fault handler, for
- * example, if a PTE is already dirty and no other
- * COW or special handling is required.
+ * Optimize for order-0 folios by optimizing out all
+ * loops.
*/
- if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) &&
- !pte_write(ptent))
- set_write_prot_commit_flush_ptes(vma, folio, page,
- addr, pte, oldpte, ptent, nr_ptes, tlb);
- else
- prot_commit_flush_ptes(vma, addr, pte, oldpte, ptent,
- nr_ptes, /* idx = */ 0, /* set_write = */ false, tlb);
+ if (nr_ptes == 1) {
+ change_present_ptes(tlb, vma, addr, newprot,
+ cp_flags, folio, page, pte, 1);
+ } else {
+ change_present_ptes(tlb, vma, addr, newprot,
+ cp_flags, folio, page, pte, nr_ptes);
+ }
pages += nr_ptes;
} else if (pte_none(oldpte)) {
/*
--
2.53.0
--
Cheers,
David