public inbox for linux-mm@kvack.org
* [PATCH 0/4] mm/mprotect: micro-optimization work
@ 2026-03-19 18:31 Pedro Falcato
  2026-03-19 18:31 ` [PATCH 1/4] mm/mprotect: encourage inlining with __always_inline Pedro Falcato
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Pedro Falcato @ 2026-03-19 18:31 UTC
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes
  Cc: Pedro Falcato, Vlastimil Babka, Jann Horn, David Hildenbrand,
	Dev Jain, Luke Yang, jhladky, linux-mm, linux-kernel

After a long session of performance-cat herding, here's the first version
I am relatively ok with.

Micro-optimize the change_protection functionality and the
change_pte_range() routine. These functions run in a very tight loop, and
even small inefficiencies become evident when the loop spins hundreds,
thousands or hundreds of thousands of times.

I tried to keep the batching functionality as much as possible. Batching
accounts for part of the slowness, but not all of it. Removing it for
!arm64 architectures would speed mprotect() up even further, but could
easily pessimize cases where large folios are mapped (which is not as rare
as it seems, particularly when it comes to the page cache these days).
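
To make the trade-off concrete, here is a userspace sketch of the general
pattern the patch titles point at: keep the generic batched path out of line
and handle the single-entry case inline in the hot path. This is not the
kernel code. set_flag(), update_batch() and update_range() are hypothetical
names, and the attributes are the plain GCC/Clang ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-entry update; stands in for real PTE manipulation. */
static inline __attribute__((always_inline))
unsigned long set_flag(unsigned long entry, unsigned long flag)
{
	return entry | flag;
}

/* Generic batched path, kept out of line so the caller stays small. */
static __attribute__((noinline))
void update_batch(unsigned long *entries, size_t nr, unsigned long flag)
{
	assert(entries != NULL);
	for (size_t i = 0; i < nr; i++)
		entries[i] = set_flag(entries[i], flag);
}

/*
 * Hot-path entry point: the single-entry case is handled inline without a
 * function call, anything larger goes through the batched helper.
 */
static inline __attribute__((always_inline))
void update_range(unsigned long *entries, size_t nr, unsigned long flag)
{
	if (nr == 1) {
		entries[0] = set_flag(entries[0], flag);
		return;
	}
	update_batch(entries, nr, flag);
}
```

Whether the fast path pays off depends on how often nr == 1 shows up in
practice, which is exactly the large-folio concern above.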

The micro-benchmark used for the tests was [0] (it builds against
google/benchmark with g++ -O2 -lbenchmark repro.cpp)
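
For readers without google/benchmark at hand, a rough plain-C sketch of this
kind of measurement follows. It is not the benchmark from [0]: the mapping
size, the protection flags being toggled and the helper names now_ns() and
mprotect_bench_ns() are all assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical benchmark body: map npages anonymous pages, fault them in,
 * then flip the whole range between read-only and read-write iters times.
 * Returns the average nanoseconds per iteration (two mprotect() calls),
 * or 0 on failure.
 */
static uint64_t mprotect_bench_ns(size_t npages, unsigned int iters)
{
	size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	uint64_t start, elapsed;

	if (npages == 0 || iters == 0)
		return 0;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return 0;

	memset(buf, 1, len);	/* touch every page so the PTEs exist */

	start = now_ns();
	for (unsigned int i = 0; i < iters; i++) {
		if (mprotect(buf, len, PROT_READ) ||
		    mprotect(buf, len, PROT_READ | PROT_WRITE)) {
			munmap(buf, len);
			return 0;
		}
	}
	elapsed = now_ns() - start;

	munmap(buf, len);
	return elapsed / iters;
}
```

Calling mprotect_bench_ns(1024, 10000) from a small main() and printing the
result gives a number comparable in spirit, though not in absolute value, to
the table below.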

This resulted in the following (first entry is baseline):

---------------------------------------------------------
Benchmark               Time             CPU   Iterations
---------------------------------------------------------
mprotect_bench      85967 ns        85967 ns         6935
mprotect_bench      82402 ns        82402 ns         6745
mprotect_bench      86776 ns        86776 ns         8100
mprotect_bench      86463 ns        86463 ns         8087
mprotect_bench      73374 ns        73373 ns         9602


After the patchset we can observe roughly 14% less time per iteration in
the mprotect benchmark (85967 ns down to 73374 ns). Wonderful for the
elusive mprotect-based workloads!
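
As a sanity check on the headline number, here is the arithmetic as a small
C sketch. It assumes the first table row is the baseline and the last row is
the full series; the helper names are made up for this note. With those two
rows the time per iteration drops by about 14.6%, which is the same thing as
a throughput ratio of about 1.17x.

```c
#include <assert.h>

/* Percentage reduction in time per iteration, relative to the baseline. */
static double time_reduction_pct(double baseline_ns, double patched_ns)
{
	assert(baseline_ns > 0.0);
	return (baseline_ns - patched_ns) / baseline_ns * 100.0;
}

/* How many times more iterations fit in the same wall-clock time. */
static double throughput_ratio(double baseline_ns, double patched_ns)
{
	assert(patched_ns > 0.0);
	return baseline_ns / patched_ns;
}
```

For example, time_reduction_pct(85967.0, 73374.0) and
throughput_ratio(85967.0, 73374.0) reproduce the two figures quoted above.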

Testing & more ideas welcome. I suspect there is plenty of improvement possible
but it would require more time than what I have on my hands right now. The
entire inlined function (which inlines into change_protection()) is gigantic
- I'm not surprised this is so finicky.


[0]: https://gist.github.com/heatd/1450d273005aba91fa5744f44dfcd933
Link: https://lore.kernel.org/all/aY8-XuFZ7zCvXulB@luyang-thinkpadp1gen7.toromso.csb/

Cc: Vlastimil Babka <vbabka@kernel.org> 
Cc: Jann Horn <jannh@google.com> 
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com> 
Cc: Luke Yang <luyang@redhat.com>
Cc: jhladky@redhat.com
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

Pedro Falcato (4):
  mm/mprotect: encourage inlining with __always_inline
  mm/mprotect: move softleaf code out of the main function
  mm/mprotect: un-inline folio_pte_batch_flags()
  mm/mprotect: special-case small folios when applying write permissions

 mm/mprotect.c | 158 +++++++++++++++++++++++++++-----------------------
 1 file changed, 85 insertions(+), 73 deletions(-)

-- 
2.53.0




Thread overview: 32+ messages
2026-03-19 18:31 [PATCH 0/4] mm/mprotect: micro-optimization work Pedro Falcato
2026-03-19 18:31 ` [PATCH 1/4] mm/mprotect: encourage inlining with __always_inline Pedro Falcato
2026-03-19 18:59   ` Lorenzo Stoakes (Oracle)
2026-03-19 19:00     ` Lorenzo Stoakes (Oracle)
2026-03-19 21:28   ` David Hildenbrand (Arm)
2026-03-20  9:59     ` Pedro Falcato
2026-03-20 10:08       ` David Hildenbrand (Arm)
2026-03-19 18:31 ` [PATCH 2/4] mm/mprotect: move softleaf code out of the main function Pedro Falcato
2026-03-19 19:06   ` Lorenzo Stoakes (Oracle)
2026-03-19 21:33   ` David Hildenbrand (Arm)
2026-03-20 10:04     ` Pedro Falcato
2026-03-20 10:07       ` David Hildenbrand (Arm)
2026-03-20 10:54         ` Lorenzo Stoakes (Oracle)
2026-03-19 18:31 ` [PATCH 3/4] mm/mprotect: un-inline folio_pte_batch_flags() Pedro Falcato
2026-03-19 19:14   ` Lorenzo Stoakes (Oracle)
2026-03-19 21:41     ` David Hildenbrand (Arm)
2026-03-20 10:36       ` Lorenzo Stoakes (Oracle)
2026-03-20 10:59         ` Pedro Falcato
2026-03-20 11:02           ` David Hildenbrand (Arm)
2026-03-20 11:27           ` Lorenzo Stoakes (Oracle)
2026-03-20 11:01         ` David Hildenbrand (Arm)
2026-03-20 11:45           ` Lorenzo Stoakes (Oracle)
2026-03-23 12:56             ` David Hildenbrand (Arm)
2026-03-20 10:34     ` Pedro Falcato
2026-03-20 10:51       ` Lorenzo Stoakes (Oracle)
2026-03-19 18:31 ` [PATCH 4/4] mm/mprotect: special-case small folios when applying write permissions Pedro Falcato
2026-03-19 19:17   ` Lorenzo Stoakes (Oracle)
2026-03-20 10:36     ` Pedro Falcato
2026-03-20 10:42       ` Lorenzo Stoakes (Oracle)
2026-03-19 21:43   ` David Hildenbrand (Arm)
2026-03-20 10:37     ` Pedro Falcato
2026-03-20  2:42 ` [PATCH 0/4] mm/mprotect: micro-optimization work Andrew Morton
