From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, david@kernel.org,
catalin.marinas@arm.com, will@kernel.org
Cc: lorenzo.stoakes@oracle.com, ryan.roberts@arm.com,
Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, riel@surriel.com,
harry.yoo@oracle.com, jannh@google.com, willy@infradead.org,
baohua@kernel.org, dev.jain@arm.com,
baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: [PATCH v6 4/5] arm64: mm: implement the architecture-specific clear_flush_young_ptes()
Date: Mon, 9 Feb 2026 22:07:27 +0800
Message-ID: <ce749fbae3e900e733fa104a16fcb3ca9fe4f9bd.1770645603.git.baolin.wang@linux.alibaba.com>
In-Reply-To: <cover.1770645603.git.baolin.wang@linux.alibaba.com>

Implement the arm64-specific clear_flush_young_ptes() so that the young flags
of a batch of PTEs can be checked, and the TLB flushed, in one call. This
improves performance when reclaiming large folios.
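
To illustrate why batching helps, here is a minimal user-space sketch, not kernel code. Every name in it (MODEL_PTE_YOUNG, model_flush_range(), model_clear_flush_young(), model_clear_flush_young_batch()) is hypothetical, and the flush counter only stands in for TLB invalidation requests. It assumes the per-PTE path issues one flush per young entry, while the batched path clears all young bits first and then issues at most one range flush.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
#define MODEL_PTE_YOUNG (UINT64_C(1) << 10)	/* stand-in for the access flag */

static unsigned int model_flush_count;	/* simulated TLB flush requests */

/* Simulated TLB invalidation covering nr consecutive pages. */
static void model_flush_range(size_t nr)
{
	(void)nr;
	model_flush_count++;
}

/* Per-PTE path: test and clear one young bit, flush that page if it was set. */
static bool model_clear_flush_young(uint64_t *pte)
{
	bool young = (*pte & MODEL_PTE_YOUNG) != 0;

	*pte &= ~MODEL_PTE_YOUNG;
	if (young)
		model_flush_range(1);
	return young;
}

/* Batched path: clear nr young bits, then issue at most one range flush. */
static bool model_clear_flush_young_batch(uint64_t *ptes, size_t nr)
{
	bool young = false;

	for (size_t i = 0; i < nr; i++) {
		if (ptes[i] & MODEL_PTE_YOUNG)
			young = true;
		ptes[i] &= ~MODEL_PTE_YOUNG;
	}
	if (young)
		model_flush_range(nr);
	return young;
}
```

In this model, 16 young entries cost 16 flush requests when handled one at a time and a single request when handled as a batch. The real arm64 helpers differ in detail, so treat this only as a picture of the idea.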

Performance testing:
Allocate 10G of clean file-backed folios with mmap() in a memory cgroup, then
try to reclaim 8G of them via the memory.reclaim interface. I observe a 33%
performance improvement on my 32-core arm64 server (and a 10%+ improvement on
my x86 machine). Meanwhile, the folio_check_references() hotspot dropped from
approximately 35% to around 5%.

W/o patchset:
real	0m1.518s
user	0m0.000s
sys	0m1.518s

W/ patchset:
real	0m1.018s
user	0m0.000s
sys	0m1.018s

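
For readers who want to try something similar, the test described above could be approximated with helpers like the ones below. This is a sketch under assumptions, not the program used for the numbers in this patch: map_and_touch() and request_reclaim() are hypothetical names, and the caller is assumed to already be running inside the target memory cgroup so that the page cache is charged to it.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map a file read-only and read one byte per page so that its clean page
 * cache folios become mapped. The mapping is left in place so that reclaim
 * has to walk it. Returns the mapping or MAP_FAILED, and stores the mapped
 * length and the number of pages touched.
 */
static void *map_and_touch(const char *path, size_t *len, long *pages)
{
	struct stat st;
	long page_size = sysconf(_SC_PAGESIZE);
	volatile unsigned char sink = 0;
	unsigned char *p;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return MAP_FAILED;
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return MAP_FAILED;
	}
	*len = (size_t)st.st_size;
	p = mmap(NULL, *len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);
	if (p == MAP_FAILED)
		return MAP_FAILED;

	*pages = 0;
	for (size_t off = 0; off < *len; off += (size_t)page_size) {
		sink ^= p[off];		/* fault the page in */
		(*pages)++;
	}
	return p;
}

/* Write a size such as "8G" to a cgroup v2 memory.reclaim file. */
static int request_reclaim(const char *reclaim_file, const char *amount)
{
	int fd = open(reclaim_file, O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = write(fd, amount, strlen(amount));
	close(fd);
	return n == (ssize_t)strlen(amount) ? 0 : -1;
}
```

A caller would map a 10G file with map_and_touch(), then time a call such as request_reclaim("/sys/fs/cgroup/test/memory.reclaim", "8G"), where the cgroup path is only an example and depends on the local setup.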
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
arch/arm64/include/asm/pgtable.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3dabf5ea17fa..a17eb8a76788 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1838,6 +1838,17 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 	return contpte_clear_flush_young_ptes(vma, addr, ptep, 1);
 }
 
+#define clear_flush_young_ptes clear_flush_young_ptes
+static inline int clear_flush_young_ptes(struct vm_area_struct *vma,
+					 unsigned long addr, pte_t *ptep,
+					 unsigned int nr)
+{
+	if (likely(nr == 1 && !pte_cont(__ptep_get(ptep))))
+		return __ptep_clear_flush_young(vma, addr, ptep);
+
+	return contpte_clear_flush_young_ptes(vma, addr, ptep, nr);
+}
+
 #define wrprotect_ptes wrprotect_ptes
 static __always_inline void wrprotect_ptes(struct mm_struct *mm,
 				unsigned long addr, pte_t *ptep, unsigned int nr)
--
2.47.3