public inbox for linux-mm@kvack.org
* [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites
@ 2026-03-25 11:40 Nico Pache
  2026-03-25 11:40 ` [PATCH mm-unstable v4 1/5] mm: consolidate anonymous folio PTE mapping into helpers Nico Pache
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Nico Pache @ 2026-03-25 11:40 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: aarcange, akpm, anshuman.khandual, apopple, baohua, baolin.wang,
	byungchul, catalin.marinas, cl, corbet, dave.hansen, david,
	dev.jain, gourry, hannes, hughd, jackmanb, jack, jannh, jglisse,
	joshua.hahnjy, kas, lance.yang, Liam.Howlett, lorenzo.stoakes,
	mathieu.desnoyers, matthew.brost, mhiramat, mhocko, npache,
	peterx, pfalcato, rakie.kim, raquini, rdunlap, richard.weiyang,
	rientjes, rostedt, rppt, ryan.roberts, shivankg, sunnanyong,
	surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

MAINTAINER NOTE: This is based on mm-unstable with the corresponding
patches reverted then reapplied.

The following series contains cleanups and prerequisites for my work on
khugepaged mTHP support [1]. These have been separated out to ease review.

The first patch in the series refactors the page fault folio-to-PTE
mapping and follows a convention similar to the one established by
map_anon_folio_pmd_(no)pf(). This not only cleans up the current
implementation of do_anonymous_page(), but will also allow for reuse later
in the khugepaged mTHP implementation.

The second patch adds a small is_pmd_order() helper to check if a given
order is the PMD order. This check is open-coded in a number of places;
this patch cleans those up, and the helper will see further use in the
khugepaged mTHP work. The third patch adds a small define for
(HPAGE_PMD_NR - 1), which is used often across the khugepaged code.

The fourth and fifth patches come from the khugepaged mTHP patchset [1].
These two patches include the rename of function prefixes and the
unification of khugepaged and madvise_collapse via a new
collapse_single_pmd() function.

Patch 1:     Refactor do_anonymous_page into map_anon_folio_pte_(no)pf
Patch 2:     Add is_pmd_order helper
Patch 3:     Add define for (HPAGE_PMD_NR - 1)
Patch 4:     Refactor/rename hpage_collapse
Patch 5:     Refactor to combine madvise_collapse and khugepaged

Testing:
- Built for x86_64, aarch64, ppc64le, and s390x
- Ran test suites provided by the kernel-tests project on all arches
- Ran the mm selftests

V4 Changes:
 - added RB and SB tags
 - Patch1: commit message cleanup/additions
 - Patch1: constify two variables, and change 1<<order to 1L<<..
 - Patch1: change zero-page read path to use update_mmu_cache variant
 - Patch5: remove dead code switch statement (SCAN_PTE_MAPPED_HUGEPAGE)
 - Patch5: remove local mmap_locked from madvise_collapse()
 - Patch5: rename mmap_locked to lock_dropped in ..scan_mm_slot() and
   invert the logic. The madvise/khugepaged code now shares the same
   naming convention across both functions.
 - Patch5: add assertion to collapse_single_pmd() so both madvise_collapse
   and khugepaged assert the lock.
 - Patch5: Convert one of the VM_BUG_ONs to a VM_WARN_ON_ONCE

V3 - https://lore.kernel.org/all/20260311211315.450947-1-npache@redhat.com
V2 - https://lore.kernel.org/all/20260226012929.169479-1-npache@redhat.com
V1 - https://lore.kernel.org/all/20260212021835.17755-1-npache@redhat.com

A big thanks to everyone who has reviewed, tested, and participated in
the development process.

[1] - https://lore.kernel.org/all/20260122192841.128719-1-npache@redhat.com
[2] - https://lore.kernel.org/all/7334b702-f6a0-4ccf-8ac6-8426a90d1846@kernel.org/
[3] - https://lore.kernel.org/all/25723c0f-c702-44ad-93e9-1056313680cd@kernel.org/
[4] - https://lore.kernel.org/all/81ff9caa-50f2-4951-8d82-2c8dcdf3db91@kernel.org/

Nico Pache (5):
  mm: consolidate anonymous folio PTE mapping into helpers
  mm: introduce is_pmd_order helper
  mm/khugepaged: define KHUGEPAGED_MAX_PTES_LIMIT as HPAGE_PMD_NR - 1
  mm/khugepaged: rename hpage_collapse_* to collapse_*
  mm/khugepaged: unify khugepaged and madv_collapse with
    collapse_single_pmd()

 include/linux/huge_mm.h |   5 +
 include/linux/mm.h      |   4 +
 mm/huge_memory.c        |   2 +-
 mm/khugepaged.c         | 207 ++++++++++++++++++++--------------------
 mm/memory.c             |  63 ++++++++----
 mm/mempolicy.c          |   2 +-
 mm/mremap.c             |   2 +-
 mm/page_alloc.c         |   4 +-
 mm/shmem.c              |   3 +-
 9 files changed, 161 insertions(+), 131 deletions(-)

-- 
2.53.0




* [PATCH mm-unstable v4 1/5] mm: consolidate anonymous folio PTE mapping into helpers
  2026-03-25 11:40 [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
@ 2026-03-25 11:40 ` Nico Pache
  2026-03-25 11:40 ` [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper Nico Pache
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Nico Pache @ 2026-03-25 11:40 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: aarcange, akpm, anshuman.khandual, apopple, baohua, baolin.wang,
	byungchul, catalin.marinas, cl, corbet, dave.hansen, david,
	dev.jain, gourry, hannes, hughd, jackmanb, jack, jannh, jglisse,
	joshua.hahnjy, kas, lance.yang, Liam.Howlett, lorenzo.stoakes,
	mathieu.desnoyers, matthew.brost, mhiramat, mhocko, npache,
	peterx, pfalcato, rakie.kim, raquini, rdunlap, richard.weiyang,
	rientjes, rostedt, rppt, ryan.roberts, shivankg, sunnanyong,
	surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe, Lorenzo Stoakes (Oracle)

The anonymous page fault handler in do_anonymous_page() open-codes the
sequence to map a newly allocated anonymous folio at the PTE level:
	- construct the PTE entry
	- add rmap
	- add to LRU
	- set the PTEs
	- update the MMU cache.

Introduce two helpers to consolidate this duplicated logic, mirroring the
existing map_anon_folio_pmd_nopf() pattern for PMD-level mappings:

map_anon_folio_pte_nopf(): constructs the PTE entry, takes folio
references, adds the anon rmap, and puts the folio on the LRU. This
function also handles the uffd-wp marking that can occur in the pf
variant. The future khugepaged mTHP code will call this to map the newly
collapsed mTHP folio.

map_anon_folio_pte_pf(): extends the nopf variant to also handle the
MM_ANONPAGES counter update and the mTHP fault allocation statistics for
the page fault path.
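
For illustration, a caller outside the fault path (such as the future
khugepaged mTHP code) could use the nopf variant roughly as follows.
This is a hypothetical sketch, not code from this series: mm, pmd, vma,
addr, and the freshly allocated, up-to-date folio are assumed to be set
up by the caller, and error handling is elided.

	pte_t *pte;
	spinlock_t *ptl;

	pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
	if (pte) {
		/* map the new anon folio and drop the PTL */
		map_anon_folio_pte_nopf(folio, pte, vma, addr,
					/* uffd_wp */ false);
		pte_unmap_unlock(pte, ptl);
	}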

The zero-page read path in do_anonymous_page() is also untangled from the
shared setpte label, since it does not allocate a folio and should not
share the same mapping sequence as the write path. We can now leave
nr_pages uninitialized at its declaration and use the single-page
update_mmu_cache() function to handle the zero-page update.

This refactoring will also help reduce code duplication between mm/memory.c
and mm/khugepaged.c, and it provides a clean API for PTE-level anonymous
folio mapping that can be reused by future callers (like khugepaged mTHP
support).

Suggested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Nico Pache <npache@redhat.com>
---
 include/linux/mm.h |  4 +++
 mm/memory.c        | 61 +++++++++++++++++++++++++++++++---------------
 2 files changed, 45 insertions(+), 20 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index eebf940058da..7edebadb2cb2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -5226,4 +5226,8 @@ static inline bool snapshot_page_is_faithful(const struct page_snapshot *ps)
 
 void snapshot_page(struct page_snapshot *ps, const struct page *page);
 
+void map_anon_folio_pte_nopf(struct folio *folio, pte_t *pte,
+		struct vm_area_struct *vma, unsigned long addr,
+		bool uffd_wp);
+
 #endif /* _LINUX_MM_H */
diff --git a/mm/memory.c b/mm/memory.c
index bd93f34b6120..6396d32c348a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5292,6 +5292,37 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 	return folio_prealloc(vma->vm_mm, vma, vmf->address, true);
 }
 
+void map_anon_folio_pte_nopf(struct folio *folio, pte_t *pte,
+		struct vm_area_struct *vma, unsigned long addr,
+		bool uffd_wp)
+{
+	const unsigned int nr_pages = folio_nr_pages(folio);
+	pte_t entry = folio_mk_pte(folio, vma->vm_page_prot);
+
+	entry = pte_sw_mkyoung(entry);
+
+	if (vma->vm_flags & VM_WRITE)
+		entry = pte_mkwrite(pte_mkdirty(entry), vma);
+	if (uffd_wp)
+		entry = pte_mkuffd_wp(entry);
+
+	folio_ref_add(folio, nr_pages - 1);
+	folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
+	folio_add_lru_vma(folio, vma);
+	set_ptes(vma->vm_mm, addr, pte, entry, nr_pages);
+	update_mmu_cache_range(NULL, vma, addr, pte, nr_pages);
+}
+
+static void map_anon_folio_pte_pf(struct folio *folio, pte_t *pte,
+		struct vm_area_struct *vma, unsigned long addr, bool uffd_wp)
+{
+	const unsigned int order = folio_order(folio);
+
+	map_anon_folio_pte_nopf(folio, pte, vma, addr, uffd_wp);
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1L << order);
+	count_mthp_stat(order, MTHP_STAT_ANON_FAULT_ALLOC);
+}
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -5303,7 +5334,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	unsigned long addr = vmf->address;
 	struct folio *folio;
 	vm_fault_t ret = 0;
-	int nr_pages = 1;
+	int nr_pages;
 	pte_t entry;
 
 	/* File mapping without ->vm_ops ? */
@@ -5338,7 +5369,13 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 			pte_unmap_unlock(vmf->pte, vmf->ptl);
 			return handle_userfault(vmf, VM_UFFD_MISSING);
 		}
-		goto setpte;
+		if (vmf_orig_pte_uffd_wp(vmf))
+			entry = pte_mkuffd_wp(entry);
+		set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
+
+		/* No need to invalidate - it was non-present before */
+		update_mmu_cache(vma, addr, vmf->pte);
+		goto unlock;
 	}
 
 	/* Allocate our own private page. */
@@ -5362,11 +5399,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	 */
 	__folio_mark_uptodate(folio);
 
-	entry = folio_mk_pte(folio, vma->vm_page_prot);
-	entry = pte_sw_mkyoung(entry);
-	if (vma->vm_flags & VM_WRITE)
-		entry = pte_mkwrite(pte_mkdirty(entry), vma);
-
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
 	if (!vmf->pte)
 		goto release;
@@ -5388,19 +5420,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 		folio_put(folio);
 		return handle_userfault(vmf, VM_UFFD_MISSING);
 	}
-
-	folio_ref_add(folio, nr_pages - 1);
-	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
-	count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC);
-	folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
-	folio_add_lru_vma(folio, vma);
-setpte:
-	if (vmf_orig_pte_uffd_wp(vmf))
-		entry = pte_mkuffd_wp(entry);
-	set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr_pages);
-
-	/* No need to invalidate - it was non-present before */
-	update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr_pages);
+	map_anon_folio_pte_pf(folio, vmf->pte, vma, addr,
+			      vmf_orig_pte_uffd_wp(vmf));
 unlock:
 	if (vmf->pte)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
-- 
2.53.0




* [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper
  2026-03-25 11:40 [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
  2026-03-25 11:40 ` [PATCH mm-unstable v4 1/5] mm: consolidate anonymous folio PTE mapping into helpers Nico Pache
@ 2026-03-25 11:40 ` Nico Pache
  2026-03-25 12:11   ` Lorenzo Stoakes (Oracle)
  2026-03-25 11:40 ` [PATCH mm-unstable v4 3/5] mm/khugepaged: define KHUGEPAGED_MAX_PTES_LIMIT as HPAGE_PMD_NR - 1 Nico Pache
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Nico Pache @ 2026-03-25 11:40 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: aarcange, akpm, anshuman.khandual, apopple, baohua, baolin.wang,
	byungchul, catalin.marinas, cl, corbet, dave.hansen, david,
	dev.jain, gourry, hannes, hughd, jackmanb, jack, jannh, jglisse,
	joshua.hahnjy, kas, lance.yang, Liam.Howlett, lorenzo.stoakes,
	mathieu.desnoyers, matthew.brost, mhiramat, mhocko, npache,
	peterx, pfalcato, rakie.kim, raquini, rdunlap, richard.weiyang,
	rientjes, rostedt, rppt, ryan.roberts, shivankg, sunnanyong,
	surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

In order to add mTHP support to khugepaged, we will often be checking
whether a given order is (or is not) the PMD order. Some places in the
kernel already use this check, so let's create a simple helper function to
keep the code clean and readable.
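
For example (matching one of the conversions below), the open-coded
check

	if (folio_order(folio) != HPAGE_PMD_ORDER)

becomes

	if (!is_pmd_order(folio_order(folio)))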

Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Nico Pache <npache@redhat.com>
---
 include/linux/huge_mm.h | 5 +++++
 mm/huge_memory.c        | 2 +-
 mm/khugepaged.c         | 6 +++---
 mm/memory.c             | 2 +-
 mm/mempolicy.c          | 2 +-
 mm/page_alloc.c         | 4 ++--
 mm/shmem.c              | 3 +--
 7 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index c8799dca3b60..1258fa37e85b 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -769,6 +769,11 @@ static inline bool pmd_is_huge(pmd_t pmd)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+static inline bool is_pmd_order(unsigned int order)
+{
+	return order == HPAGE_PMD_ORDER;
+}
+
 static inline int split_folio_to_list_to_order(struct folio *folio,
 		struct list_head *list, int new_order)
 {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2833b06d7498..b2a6060b3c20 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4118,7 +4118,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		i_mmap_unlock_read(mapping);
 out:
 	xas_destroy(&xas);
-	if (old_order == HPAGE_PMD_ORDER)
+	if (is_pmd_order(old_order))
 		count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
 	count_mthp_stat(old_order, !ret ? MTHP_STAT_SPLIT : MTHP_STAT_SPLIT_FAILED);
 	return ret;
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6bd7a7c0632a..1f4609761294 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1547,7 +1547,7 @@ static enum scan_result try_collapse_pte_mapped_thp(struct mm_struct *mm, unsign
 	if (IS_ERR(folio))
 		return SCAN_PAGE_NULL;
 
-	if (folio_order(folio) != HPAGE_PMD_ORDER) {
+	if (!is_pmd_order(folio_order(folio))) {
 		result = SCAN_PAGE_COMPOUND;
 		goto drop_folio;
 	}
@@ -2030,7 +2030,7 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 		 * we locked the first folio, then a THP might be there already.
 		 * This will be discovered on the first iteration.
 		 */
-		if (folio_order(folio) == HPAGE_PMD_ORDER) {
+		if (is_pmd_order(folio_order(folio))) {
 			result = SCAN_PTE_MAPPED_HUGEPAGE;
 			goto out_unlock;
 		}
@@ -2358,7 +2358,7 @@ static enum scan_result hpage_collapse_scan_file(struct mm_struct *mm,
 			continue;
 		}
 
-		if (folio_order(folio) == HPAGE_PMD_ORDER) {
+		if (is_pmd_order(folio_order(folio))) {
 			result = SCAN_PTE_MAPPED_HUGEPAGE;
 			/*
 			 * PMD-sized THP implies that we can only try
diff --git a/mm/memory.c b/mm/memory.c
index 6396d32c348a..e44469f9cf65 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5573,7 +5573,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct folio *folio, struct page *pa
 	if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
 		return ret;
 
-	if (folio_order(folio) != HPAGE_PMD_ORDER)
+	if (!is_pmd_order(folio_order(folio)))
 		return ret;
 	page = &folio->page;
 
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index ff52fb94ff27..fd08771e2057 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2449,7 +2449,7 @@ static struct page *alloc_pages_mpol(gfp_t gfp, unsigned int order,
 
 	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
 	    /* filter "hugepage" allocation, unless from alloc_pages() */
-	    order == HPAGE_PMD_ORDER && ilx != NO_INTERLEAVE_INDEX) {
+	    is_pmd_order(order) && ilx != NO_INTERLEAVE_INDEX) {
 		/*
 		 * For hugepage allocation and non-interleave policy which
 		 * allows the current node (or other explicitly preferred
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 915b6aef55d0..ee81f5c67c18 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -652,7 +652,7 @@ static inline unsigned int order_to_pindex(int migratetype, int order)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	bool movable;
 	if (order > PAGE_ALLOC_COSTLY_ORDER) {
-		VM_BUG_ON(order != HPAGE_PMD_ORDER);
+		VM_BUG_ON(!is_pmd_order(order));
 
 		movable = migratetype == MIGRATE_MOVABLE;
 
@@ -684,7 +684,7 @@ static inline bool pcp_allowed_order(unsigned int order)
 	if (order <= PAGE_ALLOC_COSTLY_ORDER)
 		return true;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (order == HPAGE_PMD_ORDER)
+	if (is_pmd_order(order))
 		return true;
 #endif
 	return false;
diff --git a/mm/shmem.c b/mm/shmem.c
index d00044257401..4ecefe02881d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -5532,8 +5532,7 @@ static ssize_t thpsize_shmem_enabled_store(struct kobject *kobj,
 		spin_unlock(&huge_shmem_orders_lock);
 	} else if (sysfs_streq(buf, "inherit")) {
 		/* Do not override huge allocation policy with non-PMD sized mTHP */
-		if (shmem_huge == SHMEM_HUGE_FORCE &&
-		    order != HPAGE_PMD_ORDER)
+		if (shmem_huge == SHMEM_HUGE_FORCE && !is_pmd_order(order))
 			return -EINVAL;
 
 		spin_lock(&huge_shmem_orders_lock);
-- 
2.53.0




* [PATCH mm-unstable v4 3/5] mm/khugepaged: define KHUGEPAGED_MAX_PTES_LIMIT as HPAGE_PMD_NR - 1
  2026-03-25 11:40 [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
  2026-03-25 11:40 ` [PATCH mm-unstable v4 1/5] mm: consolidate anonymous folio PTE mapping into helpers Nico Pache
  2026-03-25 11:40 ` [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper Nico Pache
@ 2026-03-25 11:40 ` Nico Pache
  2026-03-25 11:40 ` [PATCH mm-unstable v4 4/5] mm/khugepaged: rename hpage_collapse_* to collapse_* Nico Pache
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Nico Pache @ 2026-03-25 11:40 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: aarcange, akpm, anshuman.khandual, apopple, baohua, baolin.wang,
	byungchul, catalin.marinas, cl, corbet, dave.hansen, david,
	dev.jain, gourry, hannes, hughd, jackmanb, jack, jannh, jglisse,
	joshua.hahnjy, kas, lance.yang, Liam.Howlett, lorenzo.stoakes,
	mathieu.desnoyers, matthew.brost, mhiramat, mhocko, npache,
	peterx, pfalcato, rakie.kim, raquini, rdunlap, richard.weiyang,
	rientjes, rostedt, rppt, ryan.roberts, shivankg, sunnanyong,
	surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe, Lorenzo Stoakes (Oracle)

The value (HPAGE_PMD_NR - 1) is used often in the khugepaged code to
signify the limit of the max_ptes_* values. Add a define for this to
increase code readability and reuse.

Acked-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Suggested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Nico Pache <npache@redhat.com>
---
 mm/khugepaged.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1f4609761294..c9c17bfccf0d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -89,6 +89,7 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
  *
  * Note that these are only respected if collapse was initiated by khugepaged.
  */
+#define KHUGEPAGED_MAX_PTES_LIMIT (HPAGE_PMD_NR - 1)
 unsigned int khugepaged_max_ptes_none __read_mostly;
 static unsigned int khugepaged_max_ptes_swap __read_mostly;
 static unsigned int khugepaged_max_ptes_shared __read_mostly;
@@ -259,7 +260,7 @@ static ssize_t max_ptes_none_store(struct kobject *kobj,
 	unsigned long max_ptes_none;
 
 	err = kstrtoul(buf, 10, &max_ptes_none);
-	if (err || max_ptes_none > HPAGE_PMD_NR - 1)
+	if (err || max_ptes_none > KHUGEPAGED_MAX_PTES_LIMIT)
 		return -EINVAL;
 
 	khugepaged_max_ptes_none = max_ptes_none;
@@ -284,7 +285,7 @@ static ssize_t max_ptes_swap_store(struct kobject *kobj,
 	unsigned long max_ptes_swap;
 
 	err  = kstrtoul(buf, 10, &max_ptes_swap);
-	if (err || max_ptes_swap > HPAGE_PMD_NR - 1)
+	if (err || max_ptes_swap > KHUGEPAGED_MAX_PTES_LIMIT)
 		return -EINVAL;
 
 	khugepaged_max_ptes_swap = max_ptes_swap;
@@ -310,7 +311,7 @@ static ssize_t max_ptes_shared_store(struct kobject *kobj,
 	unsigned long max_ptes_shared;
 
 	err  = kstrtoul(buf, 10, &max_ptes_shared);
-	if (err || max_ptes_shared > HPAGE_PMD_NR - 1)
+	if (err || max_ptes_shared > KHUGEPAGED_MAX_PTES_LIMIT)
 		return -EINVAL;
 
 	khugepaged_max_ptes_shared = max_ptes_shared;
@@ -382,7 +383,7 @@ int __init khugepaged_init(void)
 		return -ENOMEM;
 
 	khugepaged_pages_to_scan = HPAGE_PMD_NR * 8;
-	khugepaged_max_ptes_none = HPAGE_PMD_NR - 1;
+	khugepaged_max_ptes_none = KHUGEPAGED_MAX_PTES_LIMIT;
 	khugepaged_max_ptes_swap = HPAGE_PMD_NR / 8;
 	khugepaged_max_ptes_shared = HPAGE_PMD_NR / 2;
 
-- 
2.53.0




* [PATCH mm-unstable v4 4/5] mm/khugepaged: rename hpage_collapse_* to collapse_*
  2026-03-25 11:40 [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
                   ` (2 preceding siblings ...)
  2026-03-25 11:40 ` [PATCH mm-unstable v4 3/5] mm/khugepaged: define KHUGEPAGED_MAX_PTES_LIMIT as HPAGE_PMD_NR - 1 Nico Pache
@ 2026-03-25 11:40 ` Nico Pache
  2026-03-25 12:08   ` Lorenzo Stoakes (Oracle)
  2026-03-25 11:40 ` [PATCH mm-unstable v4 5/5] mm/khugepaged: unify khugepaged and madv_collapse with collapse_single_pmd() Nico Pache
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Nico Pache @ 2026-03-25 11:40 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: aarcange, akpm, anshuman.khandual, apopple, baohua, baolin.wang,
	byungchul, catalin.marinas, cl, corbet, dave.hansen, david,
	dev.jain, gourry, hannes, hughd, jackmanb, jack, jannh, jglisse,
	joshua.hahnjy, kas, lance.yang, Liam.Howlett, lorenzo.stoakes,
	mathieu.desnoyers, matthew.brost, mhiramat, mhocko, npache,
	peterx, pfalcato, rakie.kim, raquini, rdunlap, richard.weiyang,
	rientjes, rostedt, rppt, ryan.roberts, shivankg, sunnanyong,
	surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

The hpage_collapse_* prefix describes functions used by both
madvise_collapse and khugepaged. Remove the unnecessary hpage prefix to
shorten the function names.

Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Nico Pache <npache@redhat.com>
---
 mm/khugepaged.c | 60 ++++++++++++++++++++++++-------------------------
 mm/mremap.c     |  2 +-
 2 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index c9c17bfccf0d..3728a2cf133c 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -395,14 +395,14 @@ void __init khugepaged_destroy(void)
 	kmem_cache_destroy(mm_slot_cache);
 }
 
-static inline int hpage_collapse_test_exit(struct mm_struct *mm)
+static inline int collapse_test_exit(struct mm_struct *mm)
 {
 	return atomic_read(&mm->mm_users) == 0;
 }
 
-static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
+static inline int collapse_test_exit_or_disable(struct mm_struct *mm)
 {
-	return hpage_collapse_test_exit(mm) ||
+	return collapse_test_exit(mm) ||
 		mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm);
 }
 
@@ -436,7 +436,7 @@ void __khugepaged_enter(struct mm_struct *mm)
 	int wakeup;
 
 	/* __khugepaged_exit() must not run from under us */
-	VM_BUG_ON_MM(hpage_collapse_test_exit(mm), mm);
+	VM_BUG_ON_MM(collapse_test_exit(mm), mm);
 	if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm)))
 		return;
 
@@ -490,7 +490,7 @@ void __khugepaged_exit(struct mm_struct *mm)
 	} else if (slot) {
 		/*
 		 * This is required to serialize against
-		 * hpage_collapse_test_exit() (which is guaranteed to run
+		 * collapse_test_exit() (which is guaranteed to run
 		 * under mmap sem read mode). Stop here (after we return all
 		 * pagetables will be destroyed) until khugepaged has finished
 		 * working on the pagetables under the mmap_lock.
@@ -589,7 +589,7 @@ static enum scan_result __collapse_huge_page_isolate(struct vm_area_struct *vma,
 			goto out;
 		}
 
-		/* See hpage_collapse_scan_pmd(). */
+		/* See collapse_scan_pmd(). */
 		if (folio_maybe_mapped_shared(folio)) {
 			++shared;
 			if (cc->is_khugepaged &&
@@ -840,7 +840,7 @@ static struct collapse_control khugepaged_collapse_control = {
 	.is_khugepaged = true,
 };
 
-static bool hpage_collapse_scan_abort(int nid, struct collapse_control *cc)
+static bool collapse_scan_abort(int nid, struct collapse_control *cc)
 {
 	int i;
 
@@ -875,7 +875,7 @@ static inline gfp_t alloc_hugepage_khugepaged_gfpmask(void)
 }
 
 #ifdef CONFIG_NUMA
-static int hpage_collapse_find_target_node(struct collapse_control *cc)
+static int collapse_find_target_node(struct collapse_control *cc)
 {
 	int nid, target_node = 0, max_value = 0;
 
@@ -894,7 +894,7 @@ static int hpage_collapse_find_target_node(struct collapse_control *cc)
 	return target_node;
 }
 #else
-static int hpage_collapse_find_target_node(struct collapse_control *cc)
+static int collapse_find_target_node(struct collapse_control *cc)
 {
 	return 0;
 }
@@ -913,7 +913,7 @@ static enum scan_result hugepage_vma_revalidate(struct mm_struct *mm, unsigned l
 	enum tva_type type = cc->is_khugepaged ? TVA_KHUGEPAGED :
 				 TVA_FORCED_COLLAPSE;
 
-	if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
+	if (unlikely(collapse_test_exit_or_disable(mm)))
 		return SCAN_ANY_PROCESS;
 
 	*vmap = vma = find_vma(mm, address);
@@ -984,7 +984,7 @@ static enum scan_result check_pmd_still_valid(struct mm_struct *mm,
 
 /*
  * Bring missing pages in from swap, to complete THP collapse.
- * Only done if hpage_collapse_scan_pmd believes it is worthwhile.
+ * Only done if collapse_scan_pmd believes it is worthwhile.
  *
  * Called and returns without pte mapped or spinlocks held.
  * Returns result: if not SCAN_SUCCEED, mmap_lock has been released.
@@ -1070,7 +1070,7 @@ static enum scan_result alloc_charge_folio(struct folio **foliop, struct mm_stru
 {
 	gfp_t gfp = (cc->is_khugepaged ? alloc_hugepage_khugepaged_gfpmask() :
 		     GFP_TRANSHUGE);
-	int node = hpage_collapse_find_target_node(cc);
+	int node = collapse_find_target_node(cc);
 	struct folio *folio;
 
 	folio = __folio_alloc(gfp, HPAGE_PMD_ORDER, node, &cc->alloc_nmask);
@@ -1255,7 +1255,7 @@ static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long a
 	return result;
 }
 
-static enum scan_result hpage_collapse_scan_pmd(struct mm_struct *mm,
+static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
 		struct vm_area_struct *vma, unsigned long start_addr,
 		bool *mmap_locked, struct collapse_control *cc)
 {
@@ -1380,7 +1380,7 @@ static enum scan_result hpage_collapse_scan_pmd(struct mm_struct *mm,
 		 * hit record.
 		 */
 		node = folio_nid(folio);
-		if (hpage_collapse_scan_abort(node, cc)) {
+		if (collapse_scan_abort(node, cc)) {
 			result = SCAN_SCAN_ABORT;
 			goto out_unmap;
 		}
@@ -1446,7 +1446,7 @@ static void collect_mm_slot(struct mm_slot *slot)
 
 	lockdep_assert_held(&khugepaged_mm_lock);
 
-	if (hpage_collapse_test_exit(mm)) {
+	if (collapse_test_exit(mm)) {
 		/* free mm_slot */
 		hash_del(&slot->hash);
 		list_del(&slot->mm_node);
@@ -1801,7 +1801,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		if (find_pmd_or_thp_or_none(mm, addr, &pmd) != SCAN_SUCCEED)
 			continue;
 
-		if (hpage_collapse_test_exit(mm))
+		if (collapse_test_exit(mm))
 			continue;
 
 		if (!file_backed_vma_is_retractable(vma))
@@ -2317,7 +2317,7 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 	return result;
 }
 
-static enum scan_result hpage_collapse_scan_file(struct mm_struct *mm,
+static enum scan_result collapse_scan_file(struct mm_struct *mm,
 		unsigned long addr, struct file *file, pgoff_t start,
 		struct collapse_control *cc)
 {
@@ -2370,7 +2370,7 @@ static enum scan_result hpage_collapse_scan_file(struct mm_struct *mm,
 		}
 
 		node = folio_nid(folio);
-		if (hpage_collapse_scan_abort(node, cc)) {
+		if (collapse_scan_abort(node, cc)) {
 			result = SCAN_SCAN_ABORT;
 			folio_put(folio);
 			break;
@@ -2424,7 +2424,7 @@ static enum scan_result hpage_collapse_scan_file(struct mm_struct *mm,
 	return result;
 }
 
-static void khugepaged_scan_mm_slot(unsigned int progress_max,
+static void collapse_scan_mm_slot(unsigned int progress_max,
 		enum scan_result *result, struct collapse_control *cc)
 	__releases(&khugepaged_mm_lock)
 	__acquires(&khugepaged_mm_lock)
@@ -2458,7 +2458,7 @@ static void khugepaged_scan_mm_slot(unsigned int progress_max,
 		goto breakouterloop_mmap_lock;
 
 	cc->progress++;
-	if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
+	if (unlikely(collapse_test_exit_or_disable(mm)))
 		goto breakouterloop;
 
 	vma_iter_init(&vmi, mm, khugepaged_scan.address);
@@ -2466,7 +2466,7 @@ static void khugepaged_scan_mm_slot(unsigned int progress_max,
 		unsigned long hstart, hend;
 
 		cond_resched();
-		if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
+		if (unlikely(collapse_test_exit_or_disable(mm))) {
 			cc->progress++;
 			break;
 		}
@@ -2488,7 +2488,7 @@ static void khugepaged_scan_mm_slot(unsigned int progress_max,
 			bool mmap_locked = true;
 
 			cond_resched();
-			if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
+			if (unlikely(collapse_test_exit_or_disable(mm)))
 				goto breakouterloop;
 
 			VM_BUG_ON(khugepaged_scan.address < hstart ||
@@ -2501,12 +2501,12 @@ static void khugepaged_scan_mm_slot(unsigned int progress_max,
 
 				mmap_read_unlock(mm);
 				mmap_locked = false;
-				*result = hpage_collapse_scan_file(mm,
+				*result = collapse_scan_file(mm,
 					khugepaged_scan.address, file, pgoff, cc);
 				fput(file);
 				if (*result == SCAN_PTE_MAPPED_HUGEPAGE) {
 					mmap_read_lock(mm);
-					if (hpage_collapse_test_exit_or_disable(mm))
+					if (collapse_test_exit_or_disable(mm))
 						goto breakouterloop;
 					*result = try_collapse_pte_mapped_thp(mm,
 						khugepaged_scan.address, false);
@@ -2515,7 +2515,7 @@ static void khugepaged_scan_mm_slot(unsigned int progress_max,
 					mmap_read_unlock(mm);
 				}
 			} else {
-				*result = hpage_collapse_scan_pmd(mm, vma,
+				*result = collapse_scan_pmd(mm, vma,
 					khugepaged_scan.address, &mmap_locked, cc);
 			}
 
@@ -2547,7 +2547,7 @@ static void khugepaged_scan_mm_slot(unsigned int progress_max,
 	 * Release the current mm_slot if this mm is about to die, or
 	 * if we scanned all vmas of this mm, or THP got disabled.
 	 */
-	if (hpage_collapse_test_exit_or_disable(mm) || !vma) {
+	if (collapse_test_exit_or_disable(mm) || !vma) {
 		/*
 		 * Make sure that if mm_users is reaching zero while
 		 * khugepaged runs here, khugepaged_exit will find
@@ -2600,7 +2600,7 @@ static void khugepaged_do_scan(struct collapse_control *cc)
 			pass_through_head++;
 		if (khugepaged_has_work() &&
 		    pass_through_head < 2)
-			khugepaged_scan_mm_slot(progress_max, &result, cc);
+			collapse_scan_mm_slot(progress_max, &result, cc);
 		else
 			cc->progress = progress_max;
 		spin_unlock(&khugepaged_mm_lock);
@@ -2845,8 +2845,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 			mmap_read_unlock(mm);
 			mmap_locked = false;
 			*lock_dropped = true;
-			result = hpage_collapse_scan_file(mm, addr, file, pgoff,
-							  cc);
+			result = collapse_scan_file(mm, addr, file, pgoff, cc);
 
 			if (result == SCAN_PAGE_DIRTY_OR_WRITEBACK && !triggered_wb &&
 			    mapping_can_writeback(file->f_mapping)) {
@@ -2860,8 +2859,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 			}
 			fput(file);
 		} else {
-			result = hpage_collapse_scan_pmd(mm, vma, addr,
-							 &mmap_locked, cc);
+			result = collapse_scan_pmd(mm, vma, addr, &mmap_locked, cc);
 		}
 		if (!mmap_locked)
 			*lock_dropped = true;
diff --git a/mm/mremap.c b/mm/mremap.c
index 8566e32d58d9..e9c8b1d05832 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -244,7 +244,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
 		goto out;
 	}
 	/*
-	 * Now new_pte is none, so hpage_collapse_scan_file() path can not find
+	 * Now new_pte is none, so collapse_scan_file() path can not find
 	 * this by traversing file->f_mapping, so there is no concurrency with
 	 * retract_page_tables(). In addition, we already hold the exclusive
 	 * mmap_lock, so this new_pte page is stable, so there is no need to get
-- 
2.53.0




* [PATCH mm-unstable v4 5/5] mm/khugepaged: unify khugepaged and madv_collapse with collapse_single_pmd()
  2026-03-25 11:40 [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
                   ` (3 preceding siblings ...)
  2026-03-25 11:40 ` [PATCH mm-unstable v4 4/5] mm/khugepaged: rename hpage_collapse_* to collapse_* Nico Pache
@ 2026-03-25 11:40 ` Nico Pache
  2026-03-25 11:44 ` [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Lorenzo Stoakes (Oracle)
  2026-03-26  0:25 ` Andrew Morton
  6 siblings, 0 replies; 14+ messages in thread
From: Nico Pache @ 2026-03-25 11:40 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: aarcange, akpm, anshuman.khandual, apopple, baohua, baolin.wang,
	byungchul, catalin.marinas, cl, corbet, dave.hansen, david,
	dev.jain, gourry, hannes, hughd, jackmanb, jack, jannh, jglisse,
	joshua.hahnjy, kas, lance.yang, Liam.Howlett, lorenzo.stoakes,
	mathieu.desnoyers, matthew.brost, mhiramat, mhocko, npache,
	peterx, pfalcato, rakie.kim, raquini, rdunlap, richard.weiyang,
	rientjes, rostedt, rppt, ryan.roberts, shivankg, sunnanyong,
	surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe, Lorenzo Stoakes (Oracle)

The khugepaged daemon and madvise_collapse have two different
implementations that do almost the same thing. Create collapse_single_pmd()
to increase code reuse and provide a common entry point for these two
users.

Refactor madvise_collapse and collapse_scan_mm_slot to use the new
collapse_single_pmd function. To help reduce confusion around the
mmap_locked variable, we rename mmap_locked to lock_dropped in the
collapse_scan_mm_slot() function and remove the redundant mmap_locked
in madvise_collapse(); this further unifies the code and improves
readability. The SCAN_PTE_MAPPED_HUGEPAGE enum is no longer reachable in
the madvise_collapse() function, so we drop it from the list of
"continuing" enums.
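
For illustration, the resulting caller contract in madvise_collapse()
looks roughly like this (a condensed sketch of the code below, not
verbatim):

	bool lock_dropped = false;
	enum scan_result result;

	/* caller holds mmap_lock for read */
	result = collapse_single_pmd(addr, vma, &lock_dropped, cc);
	if (lock_dropped) {
		/*
		 * The helper released mmap_lock: re-take it and
		 * revalidate the VMA before using it again.
		 */
		mmap_read_lock(mm);
	}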

This introduces a minor behavioral change; the old behavior was most
likely an undiscovered bug. The current implementation of khugepaged tests
collapse_test_exit_or_disable() before calling
try_collapse_pte_mapped_thp(), but this check was not performed in the
madvise_collapse case. By unifying these two callers, madvise_collapse now
also performs this check. We also modify the return value to
SCAN_ANY_PROCESS, which properly indicates that this process is no longer
valid to operate on.

By moving the madvise_collapse writeback-retry logic into the helper
function we can also avoid having to revalidate the VMA.

We guard the khugepaged_pages_collapsed variable to ensure it is only
incremented for khugepaged.

As requested, we also convert one of the VM_BUG_ONs to a VM_WARN_ON_ONCE.

Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Nico Pache <npache@redhat.com>
---
 mm/khugepaged.c | 142 ++++++++++++++++++++++++------------------------
 1 file changed, 72 insertions(+), 70 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 3728a2cf133c..d06d84219e1b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1257,7 +1257,7 @@ static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long a
 
 static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
 		struct vm_area_struct *vma, unsigned long start_addr,
-		bool *mmap_locked, struct collapse_control *cc)
+		bool *lock_dropped, struct collapse_control *cc)
 {
 	pmd_t *pmd;
 	pte_t *pte, *_pte;
@@ -1432,7 +1432,7 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
 		result = collapse_huge_page(mm, start_addr, referenced,
 					    unmapped, cc);
 		/* collapse_huge_page will return with the mmap_lock released */
-		*mmap_locked = false;
+		*lock_dropped = true;
 	}
 out:
 	trace_mm_khugepaged_scan_pmd(mm, folio, referenced,
@@ -2424,6 +2424,67 @@ static enum scan_result collapse_scan_file(struct mm_struct *mm,
 	return result;
 }
 
+/*
+ * Try to collapse a single PMD starting at a PMD aligned addr, and return
+ * the results.
+ */
+static enum scan_result collapse_single_pmd(unsigned long addr,
+		struct vm_area_struct *vma, bool *lock_dropped,
+		struct collapse_control *cc)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	bool triggered_wb = false;
+	enum scan_result result;
+	struct file *file;
+	pgoff_t pgoff;
+
+	mmap_assert_locked(mm);
+
+	if (vma_is_anonymous(vma)) {
+		result = collapse_scan_pmd(mm, vma, addr, lock_dropped, cc);
+		goto end;
+	}
+
+	file = get_file(vma->vm_file);
+	pgoff = linear_page_index(vma, addr);
+
+	mmap_read_unlock(mm);
+	*lock_dropped = true;
+retry:
+	result = collapse_scan_file(mm, addr, file, pgoff, cc);
+
+	/*
+	 * For MADV_COLLAPSE, when encountering dirty pages, try to writeback,
+	 * then retry the collapse one time.
+	 */
+	if (!cc->is_khugepaged && result == SCAN_PAGE_DIRTY_OR_WRITEBACK &&
+	    !triggered_wb && mapping_can_writeback(file->f_mapping)) {
+		const loff_t lstart = (loff_t)pgoff << PAGE_SHIFT;
+		const loff_t lend = lstart + HPAGE_PMD_SIZE - 1;
+
+		filemap_write_and_wait_range(file->f_mapping, lstart, lend);
+		triggered_wb = true;
+		goto retry;
+	}
+	fput(file);
+
+	if (result == SCAN_PTE_MAPPED_HUGEPAGE) {
+		mmap_read_lock(mm);
+		if (collapse_test_exit_or_disable(mm))
+			result = SCAN_ANY_PROCESS;
+		else
+			result = try_collapse_pte_mapped_thp(mm, addr,
+							     !cc->is_khugepaged);
+		if (result == SCAN_PMD_MAPPED)
+			result = SCAN_SUCCEED;
+		mmap_read_unlock(mm);
+	}
+end:
+	if (cc->is_khugepaged && result == SCAN_SUCCEED)
+		++khugepaged_pages_collapsed;
+	return result;
+}
+
 static void collapse_scan_mm_slot(unsigned int progress_max,
 		enum scan_result *result, struct collapse_control *cc)
 	__releases(&khugepaged_mm_lock)
@@ -2485,46 +2546,21 @@ static void collapse_scan_mm_slot(unsigned int progress_max,
 		VM_BUG_ON(khugepaged_scan.address & ~HPAGE_PMD_MASK);
 
 		while (khugepaged_scan.address < hend) {
-			bool mmap_locked = true;
+			bool lock_dropped = false;
 
 			cond_resched();
 			if (unlikely(collapse_test_exit_or_disable(mm)))
 				goto breakouterloop;
 
-			VM_BUG_ON(khugepaged_scan.address < hstart ||
+			VM_WARN_ON_ONCE(khugepaged_scan.address < hstart ||
 				  khugepaged_scan.address + HPAGE_PMD_SIZE >
 				  hend);
-			if (!vma_is_anonymous(vma)) {
-				struct file *file = get_file(vma->vm_file);
-				pgoff_t pgoff = linear_page_index(vma,
-						khugepaged_scan.address);
-
-				mmap_read_unlock(mm);
-				mmap_locked = false;
-				*result = collapse_scan_file(mm,
-					khugepaged_scan.address, file, pgoff, cc);
-				fput(file);
-				if (*result == SCAN_PTE_MAPPED_HUGEPAGE) {
-					mmap_read_lock(mm);
-					if (collapse_test_exit_or_disable(mm))
-						goto breakouterloop;
-					*result = try_collapse_pte_mapped_thp(mm,
-						khugepaged_scan.address, false);
-					if (*result == SCAN_PMD_MAPPED)
-						*result = SCAN_SUCCEED;
-					mmap_read_unlock(mm);
-				}
-			} else {
-				*result = collapse_scan_pmd(mm, vma,
-					khugepaged_scan.address, &mmap_locked, cc);
-			}
-
-			if (*result == SCAN_SUCCEED)
-				++khugepaged_pages_collapsed;
 
+			*result = collapse_single_pmd(khugepaged_scan.address,
+						      vma, &lock_dropped, cc);
 			/* move to next address */
 			khugepaged_scan.address += HPAGE_PMD_SIZE;
-			if (!mmap_locked)
+			if (lock_dropped)
 				/*
 				 * We released mmap_lock so break loop.  Note
 				 * that we drop mmap_lock before all hugepage
@@ -2799,7 +2835,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 	unsigned long hstart, hend, addr;
 	enum scan_result last_fail = SCAN_FAIL;
 	int thps = 0;
-	bool mmap_locked = true;
 
 	BUG_ON(vma->vm_start > start);
 	BUG_ON(vma->vm_end < end);
@@ -2821,13 +2856,11 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 
 	for (addr = hstart; addr < hend; addr += HPAGE_PMD_SIZE) {
 		enum scan_result result = SCAN_FAIL;
-		bool triggered_wb = false;
 
-retry:
-		if (!mmap_locked) {
+		if (*lock_dropped) {
 			cond_resched();
 			mmap_read_lock(mm);
-			mmap_locked = true;
+			*lock_dropped = false;
 			result = hugepage_vma_revalidate(mm, addr, false, &vma,
 							 cc);
 			if (result  != SCAN_SUCCEED) {
@@ -2837,45 +2870,14 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 
 			hend = min(hend, vma->vm_end & HPAGE_PMD_MASK);
 		}
-		mmap_assert_locked(mm);
-		if (!vma_is_anonymous(vma)) {
-			struct file *file = get_file(vma->vm_file);
-			pgoff_t pgoff = linear_page_index(vma, addr);
 
-			mmap_read_unlock(mm);
-			mmap_locked = false;
-			*lock_dropped = true;
-			result = collapse_scan_file(mm, addr, file, pgoff, cc);
-
-			if (result == SCAN_PAGE_DIRTY_OR_WRITEBACK && !triggered_wb &&
-			    mapping_can_writeback(file->f_mapping)) {
-				loff_t lstart = (loff_t)pgoff << PAGE_SHIFT;
-				loff_t lend = lstart + HPAGE_PMD_SIZE - 1;
-
-				filemap_write_and_wait_range(file->f_mapping, lstart, lend);
-				triggered_wb = true;
-				fput(file);
-				goto retry;
-			}
-			fput(file);
-		} else {
-			result = collapse_scan_pmd(mm, vma, addr, &mmap_locked, cc);
-		}
-		if (!mmap_locked)
-			*lock_dropped = true;
+		result = collapse_single_pmd(addr, vma, lock_dropped, cc);
 
-handle_result:
 		switch (result) {
 		case SCAN_SUCCEED:
 		case SCAN_PMD_MAPPED:
 			++thps;
 			break;
-		case SCAN_PTE_MAPPED_HUGEPAGE:
-			BUG_ON(mmap_locked);
-			mmap_read_lock(mm);
-			result = try_collapse_pte_mapped_thp(mm, addr, true);
-			mmap_read_unlock(mm);
-			goto handle_result;
 		/* Whitelisted set of results where continuing OK */
 		case SCAN_NO_PTE_TABLE:
 		case SCAN_PTE_NON_PRESENT:
@@ -2898,7 +2900,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 
 out_maybelock:
 	/* Caller expects us to hold mmap_lock on return */
-	if (!mmap_locked)
+	if (*lock_dropped)
 		mmap_read_lock(mm);
 out_nolock:
 	mmap_assert_locked(mm);
-- 
2.53.0




* Re: [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites
  2026-03-25 11:40 [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
                   ` (4 preceding siblings ...)
  2026-03-25 11:40 ` [PATCH mm-unstable v4 5/5] mm/khugepaged: unify khugepaged and madv_collapse with collapse_single_pmd() Nico Pache
@ 2026-03-25 11:44 ` Lorenzo Stoakes (Oracle)
  2026-03-26  0:25 ` Andrew Morton
  6 siblings, 0 replies; 14+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-25 11:44 UTC (permalink / raw)
  To: Nico Pache
  Cc: linux-kernel, linux-mm, aarcange, akpm, anshuman.khandual,
	apopple, baohua, baolin.wang, byungchul, catalin.marinas, cl,
	corbet, dave.hansen, david, dev.jain, gourry, hannes, hughd,
	jackmanb, jack, jannh, jglisse, joshua.hahnjy, kas, lance.yang,
	Liam.Howlett, mathieu.desnoyers, matthew.brost, mhiramat, mhocko,
	peterx, pfalcato, rakie.kim, raquini, rdunlap, richard.weiyang,
	rientjes, rostedt, rppt, ryan.roberts, shivankg, sunnanyong,
	surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

-cc old email

Gentle reminder to please send new stuff to ljs@kernel.org!

Thanks, Lorenzo

On Wed, Mar 25, 2026 at 05:40:17AM -0600, Nico Pache wrote:
> [snip]



* Re: [PATCH mm-unstable v4 4/5] mm/khugepaged: rename hpage_collapse_* to collapse_*
  2026-03-25 11:40 ` [PATCH mm-unstable v4 4/5] mm/khugepaged: rename hpage_collapse_* to collapse_* Nico Pache
@ 2026-03-25 12:08   ` Lorenzo Stoakes (Oracle)
  0 siblings, 0 replies; 14+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-25 12:08 UTC (permalink / raw)
  To: Nico Pache
  Cc: linux-kernel, linux-mm, aarcange, akpm, anshuman.khandual,
	apopple, baohua, baolin.wang, byungchul, catalin.marinas, cl,
	corbet, dave.hansen, david, dev.jain, gourry, hannes, hughd,
	jackmanb, jack, jannh, jglisse, joshua.hahnjy, kas, lance.yang,
	Liam.Howlett, lorenzo.stoakes, mathieu.desnoyers, matthew.brost,
	mhiramat, mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
	richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
	sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

On Wed, Mar 25, 2026 at 05:40:21AM -0600, Nico Pache wrote:
> [snip]
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Nit, but could we please update this to:

Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

Thanks, Lorenzo

> [snip]
> @@ -2845,8 +2845,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>  			mmap_read_unlock(mm);
>  			mmap_locked = false;
>  			*lock_dropped = true;
> -			result = hpage_collapse_scan_file(mm, addr, file, pgoff,
> -							  cc);
> +			result = collapse_scan_file(mm, addr, file, pgoff, cc);
>
>  			if (result == SCAN_PAGE_DIRTY_OR_WRITEBACK && !triggered_wb &&
>  			    mapping_can_writeback(file->f_mapping)) {
> @@ -2860,8 +2859,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>  			}
>  			fput(file);
>  		} else {
> -			result = hpage_collapse_scan_pmd(mm, vma, addr,
> -							 &mmap_locked, cc);
> +			result = collapse_scan_pmd(mm, vma, addr, &mmap_locked, cc);
>  		}
>  		if (!mmap_locked)
>  			*lock_dropped = true;
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 8566e32d58d9..e9c8b1d05832 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -244,7 +244,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
>  		goto out;
>  	}
>  	/*
> -	 * Now new_pte is none, so hpage_collapse_scan_file() path can not find
> +	 * Now new_pte is none, so collapse_scan_file() path can not find
>  	 * this by traversing file->f_mapping, so there is no concurrency with
>  	 * retract_page_tables(). In addition, we already hold the exclusive
>  	 * mmap_lock, so this new_pte page is stable, so there is no need to get
> --
> 2.53.0
>
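
The hunks above show only the renamed signatures. For orientation, here is a
rough sketch of the shape of the scan-abort helper (illustrative only: the
cc->node_load[] bookkeeping is assumed from collapse_control, and
node_reclaim_enabled()/node_distance()/node_reclaim_distance are the existing
kernel helpers; this is not a verbatim copy of the kernel body):

	static bool collapse_scan_abort_sketch(int nid, struct collapse_control *cc)
	{
		int i;

		/* Without node reclaim, no extra effort is made to stay local. */
		if (!node_reclaim_enabled())
			return false;

		/* A node we have already counted pages on is acceptable. */
		if (cc->node_load[nid])
			return false;

		/* Abort if this node is too far from one already counted. */
		for (i = 0; i < MAX_NUMNODES; i++) {
			if (!cc->node_load[i])
				continue;
			if (node_distance(nid, i) > node_reclaim_distance)
				return true;
		}
		return false;
	}

The point of the check is that collapsing pages spread across distant NUMA
nodes would place the resulting huge page far from most of its data.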


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper
  2026-03-25 11:40 ` [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper Nico Pache
@ 2026-03-25 12:11   ` Lorenzo Stoakes (Oracle)
  2026-03-25 14:45     ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-25 12:11 UTC (permalink / raw)
  To: Nico Pache
  Cc: linux-kernel, linux-mm, aarcange, akpm, anshuman.khandual,
	apopple, baohua, baolin.wang, byungchul, catalin.marinas, cl,
	corbet, dave.hansen, david, dev.jain, gourry, hannes, hughd,
	jackmanb, jack, jannh, jglisse, joshua.hahnjy, kas, lance.yang,
	Liam.Howlett, lorenzo.stoakes, mathieu.desnoyers, matthew.brost,
	mhiramat, mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
	richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
	sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

On Wed, Mar 25, 2026 at 05:40:19AM -0600, Nico Pache wrote:
> In order to add mTHP support to khugepaged, we will often be checking if a
> given order is (or is not) a PMD order. Some places in the kernel already
> use this check, so let's create a simple helper function to keep the code
> clean and readable.
>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Reviewed-by: Dev Jain <dev.jain@arm.com>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> Reviewed-by: Lance Yang <lance.yang@linux.dev>
> Reviewed-by: Barry Song <baohua@kernel.org>
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Pedro Falcato <pfalcato@suse.de>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Nit, but could we please update both to:

Lorenzo Stoakes (Oracle) <ljs@kernel.org>

Thanks :), Lorenzo

> Signed-off-by: Nico Pache <npache@redhat.com>
> ---
>  include/linux/huge_mm.h | 5 +++++
>  mm/huge_memory.c        | 2 +-
>  mm/khugepaged.c         | 6 +++---
>  mm/memory.c             | 2 +-
>  mm/mempolicy.c          | 2 +-
>  mm/page_alloc.c         | 4 ++--
>  mm/shmem.c              | 3 +--
>  7 files changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index c8799dca3b60..1258fa37e85b 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -769,6 +769,11 @@ static inline bool pmd_is_huge(pmd_t pmd)
>  }
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> +static inline bool is_pmd_order(unsigned int order)
> +{
> +	return order == HPAGE_PMD_ORDER;
> +}
> +
>  static inline int split_folio_to_list_to_order(struct folio *folio,
>  		struct list_head *list, int new_order)
>  {
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2833b06d7498..b2a6060b3c20 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -4118,7 +4118,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  		i_mmap_unlock_read(mapping);
>  out:
>  	xas_destroy(&xas);
> -	if (old_order == HPAGE_PMD_ORDER)
> +	if (is_pmd_order(old_order))
>  		count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
>  	count_mthp_stat(old_order, !ret ? MTHP_STAT_SPLIT : MTHP_STAT_SPLIT_FAILED);
>  	return ret;
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 6bd7a7c0632a..1f4609761294 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1547,7 +1547,7 @@ static enum scan_result try_collapse_pte_mapped_thp(struct mm_struct *mm, unsign
>  	if (IS_ERR(folio))
>  		return SCAN_PAGE_NULL;
>
> -	if (folio_order(folio) != HPAGE_PMD_ORDER) {
> +	if (!is_pmd_order(folio_order(folio))) {
>  		result = SCAN_PAGE_COMPOUND;
>  		goto drop_folio;
>  	}
> @@ -2030,7 +2030,7 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
>  		 * we locked the first folio, then a THP might be there already.
>  		 * This will be discovered on the first iteration.
>  		 */
> -		if (folio_order(folio) == HPAGE_PMD_ORDER) {
> +		if (is_pmd_order(folio_order(folio))) {
>  			result = SCAN_PTE_MAPPED_HUGEPAGE;
>  			goto out_unlock;
>  		}
> @@ -2358,7 +2358,7 @@ static enum scan_result hpage_collapse_scan_file(struct mm_struct *mm,
>  			continue;
>  		}
>
> -		if (folio_order(folio) == HPAGE_PMD_ORDER) {
> +		if (is_pmd_order(folio_order(folio))) {
>  			result = SCAN_PTE_MAPPED_HUGEPAGE;
>  			/*
>  			 * PMD-sized THP implies that we can only try
> diff --git a/mm/memory.c b/mm/memory.c
> index 6396d32c348a..e44469f9cf65 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5573,7 +5573,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct folio *folio, struct page *pa
>  	if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
>  		return ret;
>
> -	if (folio_order(folio) != HPAGE_PMD_ORDER)
> +	if (!is_pmd_order(folio_order(folio)))
>  		return ret;
>  	page = &folio->page;
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index ff52fb94ff27..fd08771e2057 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -2449,7 +2449,7 @@ static struct page *alloc_pages_mpol(gfp_t gfp, unsigned int order,
>
>  	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
>  	    /* filter "hugepage" allocation, unless from alloc_pages() */
> -	    order == HPAGE_PMD_ORDER && ilx != NO_INTERLEAVE_INDEX) {
> +	    is_pmd_order(order) && ilx != NO_INTERLEAVE_INDEX) {
>  		/*
>  		 * For hugepage allocation and non-interleave policy which
>  		 * allows the current node (or other explicitly preferred
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 915b6aef55d0..ee81f5c67c18 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -652,7 +652,7 @@ static inline unsigned int order_to_pindex(int migratetype, int order)
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	bool movable;
>  	if (order > PAGE_ALLOC_COSTLY_ORDER) {
> -		VM_BUG_ON(order != HPAGE_PMD_ORDER);
> +		VM_BUG_ON(!is_pmd_order(order));
>
>  		movable = migratetype == MIGRATE_MOVABLE;
>
> @@ -684,7 +684,7 @@ static inline bool pcp_allowed_order(unsigned int order)
>  	if (order <= PAGE_ALLOC_COSTLY_ORDER)
>  		return true;
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -	if (order == HPAGE_PMD_ORDER)
> +	if (is_pmd_order(order))
>  		return true;
>  #endif
>  	return false;
> diff --git a/mm/shmem.c b/mm/shmem.c
> index d00044257401..4ecefe02881d 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -5532,8 +5532,7 @@ static ssize_t thpsize_shmem_enabled_store(struct kobject *kobj,
>  		spin_unlock(&huge_shmem_orders_lock);
>  	} else if (sysfs_streq(buf, "inherit")) {
>  		/* Do not override huge allocation policy with non-PMD sized mTHP */
> -		if (shmem_huge == SHMEM_HUGE_FORCE &&
> -		    order != HPAGE_PMD_ORDER)
> +		if (shmem_huge == SHMEM_HUGE_FORCE && !is_pmd_order(order))
>  			return -EINVAL;
>
>  		spin_lock(&huge_shmem_orders_lock);
> --
> 2.53.0
>
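
As a usage note, the win is mostly at negated call sites, where the open-coded
comparison hid the question being asked. An illustrative before/after,
mirroring the mm/memory.c hunk above rather than adding a new call site:

	/* Before: what is being tested takes a moment to parse. */
	if (folio_order(folio) != HPAGE_PMD_ORDER)
		return ret;

	/* After: "not PMD-sized" reads directly. */
	if (!is_pmd_order(folio_order(folio)))
		return ret;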


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper
  2026-03-25 12:11   ` Lorenzo Stoakes (Oracle)
@ 2026-03-25 14:45     ` Andrew Morton
  2026-03-25 14:49       ` Lorenzo Stoakes (Oracle)
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2026-03-25 14:45 UTC (permalink / raw)
  To: Lorenzo Stoakes (Oracle)
  Cc: Nico Pache, linux-kernel, linux-mm, aarcange, anshuman.khandual,
	apopple, baohua, baolin.wang, byungchul, catalin.marinas, cl,
	corbet, dave.hansen, david, dev.jain, gourry, hannes, hughd,
	jackmanb, jack, jannh, jglisse, joshua.hahnjy, kas, lance.yang,
	Liam.Howlett, lorenzo.stoakes, mathieu.desnoyers, matthew.brost,
	mhiramat, mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
	richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
	sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

On Wed, 25 Mar 2026 12:11:43 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:

> On Wed, Mar 25, 2026 at 05:40:19AM -0600, Nico Pache wrote:
> > In order to add mTHP support to khugepaged, we will often be checking if a
> > given order is (or is not) a PMD order. Some places in the kernel already
> > use this check, so let's create a simple helper function to keep the code
> > clean and readable.
> >
> > Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> > Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> > Reviewed-by: Dev Jain <dev.jain@arm.com>
> > Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> > Reviewed-by: Lance Yang <lance.yang@linux.dev>
> > Reviewed-by: Barry Song <baohua@kernel.org>
> > Reviewed-by: Zi Yan <ziy@nvidia.com>
> > Reviewed-by: Pedro Falcato <pfalcato@suse.de>
> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
> Nit, but could we please update both to:
> 
> Lorenzo Stoakes (Oracle) <ljs@kernel.org>

hp2:/usr/src/25> grep lorenzo.stoakes@oracle.com patches/*.patch|wc -l
105

Please lmk if you'd prefer that I go through these and update the email
address.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper
  2026-03-25 14:45     ` Andrew Morton
@ 2026-03-25 14:49       ` Lorenzo Stoakes (Oracle)
  2026-03-25 16:05         ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-25 14:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nico Pache, linux-kernel, linux-mm, aarcange, anshuman.khandual,
	apopple, baohua, baolin.wang, byungchul, catalin.marinas, cl,
	corbet, dave.hansen, david, dev.jain, gourry, hannes, hughd,
	jackmanb, jack, jannh, jglisse, joshua.hahnjy, kas, lance.yang,
	Liam.Howlett, lorenzo.stoakes, mathieu.desnoyers, matthew.brost,
	mhiramat, mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
	richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
	sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

On Wed, Mar 25, 2026 at 07:45:16AM -0700, Andrew Morton wrote:
> On Wed, 25 Mar 2026 12:11:43 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
>
> > On Wed, Mar 25, 2026 at 05:40:19AM -0600, Nico Pache wrote:
> > > In order to add mTHP support to khugepaged, we will often be checking if a
> > > given order is (or is not) a PMD order. Some places in the kernel already
> > > use this check, so let's create a simple helper function to keep the code
> > > clean and readable.
> > >
> > > Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> > > Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> > > Reviewed-by: Dev Jain <dev.jain@arm.com>
> > > Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> > > Reviewed-by: Lance Yang <lance.yang@linux.dev>
> > > Reviewed-by: Barry Song <baohua@kernel.org>
> > > Reviewed-by: Zi Yan <ziy@nvidia.com>
> > > Reviewed-by: Pedro Falcato <pfalcato@suse.de>
> > > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> >
> > Nit, but could we please update both to:
> >
> > Lorenzo Stoakes (Oracle) <ljs@kernel.org>
>
> hp2:/usr/src/25> grep lorenzo.stoakes@oracle.com patches/*.patch|wc -l
> 105
>
> Please lmk if you'd prefer that I go through these and update the email
> address.

That'd be useful thanks.

It's not a huge deal, but I'm trying to gently nudge the transition
forwards a bit...

Cheers, Lorenzo


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper
  2026-03-25 14:49       ` Lorenzo Stoakes (Oracle)
@ 2026-03-25 16:05         ` Andrew Morton
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2026-03-25 16:05 UTC (permalink / raw)
  To: Lorenzo Stoakes (Oracle)
  Cc: Nico Pache, linux-kernel, linux-mm, aarcange, anshuman.khandual,
	apopple, baohua, baolin.wang, byungchul, catalin.marinas, cl,
	corbet, dave.hansen, david, dev.jain, gourry, hannes, hughd,
	jackmanb, jack, jannh, jglisse, joshua.hahnjy, kas, lance.yang,
	Liam.Howlett, lorenzo.stoakes, mathieu.desnoyers, matthew.brost,
	mhiramat, mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
	richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
	sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

On Wed, 25 Mar 2026 14:49:30 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:

> On Wed, Mar 25, 2026 at 07:45:16AM -0700, Andrew Morton wrote:
> > On Wed, 25 Mar 2026 12:11:43 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
> >
> > > On Wed, Mar 25, 2026 at 05:40:19AM -0600, Nico Pache wrote:
> > > > In order to add mTHP support to khugepaged, we will often be checking if a
> > > > given order is (or is not) a PMD order. Some places in the kernel already
> > > > use this check, so let's create a simple helper function to keep the code
> > > > clean and readable.
> > > >
> > > > Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> > > > Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> > > > Reviewed-by: Dev Jain <dev.jain@arm.com>
> > > > Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> > > > Reviewed-by: Lance Yang <lance.yang@linux.dev>
> > > > Reviewed-by: Barry Song <baohua@kernel.org>
> > > > Reviewed-by: Zi Yan <ziy@nvidia.com>
> > > > Reviewed-by: Pedro Falcato <pfalcato@suse.de>
> > > > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > > Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > >
> > > Nit, but could we please update both to:
> > >
> > > Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> >
> > hp2:/usr/src/25> grep lorenzo.stoakes@oracle.com patches/*.patch|wc -l
> > 105
> >
> > Please lmk if you'd prefer that I go through these and update the email
> > address.
> 
> That'd be useful thanks.
> 
> It's not a huge big deal, but trying to gently nudge the transition forwards a
> bit...

Done.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites
  2026-03-25 11:40 [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
                   ` (5 preceding siblings ...)
  2026-03-25 11:44 ` [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Lorenzo Stoakes (Oracle)
@ 2026-03-26  0:25 ` Andrew Morton
  2026-03-26  4:44   ` Roman Gushchin
  6 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2026-03-26  0:25 UTC (permalink / raw)
  To: Nico Pache
  Cc: linux-kernel, linux-mm, aarcange, anshuman.khandual, apopple,
	baohua, baolin.wang, byungchul, catalin.marinas, cl, corbet,
	dave.hansen, david, dev.jain, gourry, hannes, hughd, jackmanb,
	jack, jannh, jglisse, joshua.hahnjy, kas, lance.yang,
	Liam.Howlett, lorenzo.stoakes, mathieu.desnoyers, matthew.brost,
	mhiramat, mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
	richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
	sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe, Roman Gushchin

On Wed, 25 Mar 2026 05:40:17 -0600 Nico Pache <npache@redhat.com> wrote:

> MAINTAINER NOTE: This is based on mm-unstable with the corresponding
> patches reverted then reapplied.

Unfortunately the update-in-place trick fooled the AI review, which
might otherwise have been useful.  Oh well.  In retrospect we could
have avoided that if you'd asked me to drop v3 a couple of days before
mailing out v4.

otoh, this series *does* apply to the mm-stable branch.  Roman, I
thought Sashiko was attempting that?

> The following series contains cleanups and prerequisites for my work on
> khugepaged mTHP support [1]. These have been separated out to ease review.

And boy that's a lot of reviewers!  Aren't you a lucky ducky ;)

> The first patch in the series refactors the page fault folio to pte mapping
> and follows a similar convention as defined by map_anon_folio_pmd_(no)pf().
> This not only cleans up the current implementation of do_anonymous_page(),
> but will allow for reuse later in the khugepaged mTHP implementation.
> 
> The second patch adds a small is_pmd_order() helper to check if an order is
> the PMD order. This check is open-coded in a number of places. This patch
> aims to clean this up and will be used more in the khugepaged mTHP work.
> The third patch also adds a small DEFINE for (HPAGE_PMD_NR - 1) which is
> used often across the khugepaged code.
> 
> The fourth and fifth patch come from the khugepaged mTHP patchset [1].
> These two patches include the rename of function prefixes, and the
> unification of khugepaged and madvise_collapse via a new
> collapse_single_pmd function.
> 
> Patch 1:     refactor do_anonymous_page into map_anon_folio_pte_(no)pf
> Patch 2:     add is_pmd_order helper
> Patch 3:     Add define for (HPAGE_PMD_NR - 1)
> Patch 4:     Refactor/rename hpage_collapse
> Patch 5:     Refactoring to combine madvise_collapse and khugepaged
> 

Thanks, I updated mm.git's mm-unstable branch to this version.

> V4 Changes:
>  - added RB and SB tags
>  - Patch1: commit message cleanup/additions
>  - Patch1: constify two variables, and change 1<<order to 1L<<..
>  - Patch1: change zero-page read path to use update_mmu_cache varient
>  - Patch5: remove dead code switch statement (SCAN_PTE_MAPPED_HUGEPAGE)
>  - Patch5: remove local mmap_locked from madvise_collapse()
>  - Patch5: rename mmap_locked to lock_dropped in ..scan_mm_slot() and
>    invert the logic. the madvise|khugepaged code now share the same 
>    naming convention across both functions.
>  - Patch5: add assertion to collapse_single_pmd() so both madvise_collapse
>    and khugepaged assert the lock.
>  - Patch5: Convert one of the VM_BUG_ON's to VM_WARN_ON

Below is how v4 altered mm.git:

 mm/khugepaged.c |   34 +++++++++++++++-------------------
 mm/memory.c     |   11 +++++------
 2 files changed, 20 insertions(+), 25 deletions(-)

--- a/mm/khugepaged.c~b
+++ a/mm/khugepaged.c
@@ -1250,7 +1250,7 @@ out_nolock:
 
 static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
 		struct vm_area_struct *vma, unsigned long start_addr,
-		bool *mmap_locked, struct collapse_control *cc)
+		bool *lock_dropped, struct collapse_control *cc)
 {
 	pmd_t *pmd;
 	pte_t *pte, *_pte;
@@ -1425,7 +1425,7 @@ out_unmap:
 		result = collapse_huge_page(mm, start_addr, referenced,
 					    unmapped, cc);
 		/* collapse_huge_page will return with the mmap_lock released */
-		*mmap_locked = false;
+		*lock_dropped = true;
 	}
 out:
 	trace_mm_khugepaged_scan_pmd(mm, folio, referenced,
@@ -2422,7 +2422,7 @@ static enum scan_result collapse_scan_fi
  * the results.
  */
 static enum scan_result collapse_single_pmd(unsigned long addr,
-		struct vm_area_struct *vma, bool *mmap_locked,
+		struct vm_area_struct *vma, bool *lock_dropped,
 		struct collapse_control *cc)
 {
 	struct mm_struct *mm = vma->vm_mm;
@@ -2431,8 +2431,10 @@ static enum scan_result collapse_single_
 	struct file *file;
 	pgoff_t pgoff;
 
+	mmap_assert_locked(mm);
+
 	if (vma_is_anonymous(vma)) {
-		result = collapse_scan_pmd(mm, vma, addr, mmap_locked, cc);
+		result = collapse_scan_pmd(mm, vma, addr, lock_dropped, cc);
 		goto end;
 	}
 
@@ -2440,7 +2442,7 @@ static enum scan_result collapse_single_
 	pgoff = linear_page_index(vma, addr);
 
 	mmap_read_unlock(mm);
-	*mmap_locked = false;
+	*lock_dropped = true;
 retry:
 	result = collapse_scan_file(mm, addr, file, pgoff, cc);
 
@@ -2537,21 +2539,21 @@ static void collapse_scan_mm_slot(unsign
 		VM_BUG_ON(khugepaged_scan.address & ~HPAGE_PMD_MASK);
 
 		while (khugepaged_scan.address < hend) {
-			bool mmap_locked = true;
+			bool lock_dropped = false;
 
 			cond_resched();
 			if (unlikely(collapse_test_exit_or_disable(mm)))
 				goto breakouterloop;
 
-			VM_BUG_ON(khugepaged_scan.address < hstart ||
+			VM_WARN_ON_ONCE(khugepaged_scan.address < hstart ||
 				  khugepaged_scan.address + HPAGE_PMD_SIZE >
 				  hend);
 
 			*result = collapse_single_pmd(khugepaged_scan.address,
-						      vma, &mmap_locked, cc);
+						      vma, &lock_dropped, cc);
 			/* move to next address */
 			khugepaged_scan.address += HPAGE_PMD_SIZE;
-			if (!mmap_locked)
+			if (lock_dropped)
 				/*
 				 * We released mmap_lock so break loop.  Note
 				 * that we drop mmap_lock before all hugepage
@@ -2826,7 +2828,6 @@ int madvise_collapse(struct vm_area_stru
 	unsigned long hstart, hend, addr;
 	enum scan_result last_fail = SCAN_FAIL;
 	int thps = 0;
-	bool mmap_locked = true;
 
 	BUG_ON(vma->vm_start > start);
 	BUG_ON(vma->vm_end < end);
@@ -2849,10 +2850,10 @@ int madvise_collapse(struct vm_area_stru
 	for (addr = hstart; addr < hend; addr += HPAGE_PMD_SIZE) {
 		enum scan_result result = SCAN_FAIL;
 
-		if (!mmap_locked) {
+		if (*lock_dropped) {
 			cond_resched();
 			mmap_read_lock(mm);
-			mmap_locked = true;
+			*lock_dropped = false;
 			result = hugepage_vma_revalidate(mm, addr, false, &vma,
 							 cc);
 			if (result  != SCAN_SUCCEED) {
@@ -2862,12 +2863,8 @@ int madvise_collapse(struct vm_area_stru
 
 			hend = min(hend, vma->vm_end & HPAGE_PMD_MASK);
 		}
-		mmap_assert_locked(mm);
-
-		result = collapse_single_pmd(addr, vma, &mmap_locked, cc);
 
-		if (!mmap_locked)
-			*lock_dropped = true;
+		result = collapse_single_pmd(addr, vma, lock_dropped, cc);
 
 		switch (result) {
 		case SCAN_SUCCEED:
@@ -2876,7 +2873,6 @@ int madvise_collapse(struct vm_area_stru
 			break;
 		/* Whitelisted set of results where continuing OK */
 		case SCAN_NO_PTE_TABLE:
-		case SCAN_PTE_MAPPED_HUGEPAGE:
 		case SCAN_PTE_NON_PRESENT:
 		case SCAN_PTE_UFFD_WP:
 		case SCAN_LACK_REFERENCED_PAGE:
@@ -2897,7 +2893,7 @@ int madvise_collapse(struct vm_area_stru
 
 out_maybelock:
 	/* Caller expects us to hold mmap_lock on return */
-	if (!mmap_locked)
+	if (*lock_dropped)
 		mmap_read_lock(mm);
 out_nolock:
 	mmap_assert_locked(mm);
--- a/mm/memory.c~b
+++ a/mm/memory.c
@@ -5201,7 +5201,7 @@ void map_anon_folio_pte_nopf(struct foli
 		struct vm_area_struct *vma, unsigned long addr,
 		bool uffd_wp)
 {
-	unsigned int nr_pages = folio_nr_pages(folio);
+	const unsigned int nr_pages = folio_nr_pages(folio);
 	pte_t entry = folio_mk_pte(folio, vma->vm_page_prot);
 
 	entry = pte_sw_mkyoung(entry);
@@ -5221,10 +5221,10 @@ void map_anon_folio_pte_nopf(struct foli
 static void map_anon_folio_pte_pf(struct folio *folio, pte_t *pte,
 		struct vm_area_struct *vma, unsigned long addr, bool uffd_wp)
 {
-	unsigned int order = folio_order(folio);
+	const unsigned int order = folio_order(folio);
 
 	map_anon_folio_pte_nopf(folio, pte, vma, addr, uffd_wp);
-	add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1 << order);
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1L << order);
 	count_mthp_stat(order, MTHP_STAT_ANON_FAULT_ALLOC);
 }
 
@@ -5239,7 +5239,7 @@ static vm_fault_t do_anonymous_page(stru
 	unsigned long addr = vmf->address;
 	struct folio *folio;
 	vm_fault_t ret = 0;
-	int nr_pages = 1;
+	int nr_pages;
 	pte_t entry;
 
 	/* File mapping without ->vm_ops ? */
@@ -5279,8 +5279,7 @@ static vm_fault_t do_anonymous_page(stru
 		set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
 
 		/* No need to invalidate - it was non-present before */
-		update_mmu_cache_range(vmf, vma, addr, vmf->pte,
-				       /*nr_pages=*/ 1);
+		update_mmu_cache(vma, addr, vmf->pte);
 		goto unlock;
 	}
 
_
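
Two of those hunks carry the reasoning from the v4 changelog. The
mmap_locked to lock_dropped inversion means callees now report the
exceptional event (they released mmap_lock) rather than the steady state,
so a false-initialized flag stays untouched on the common path. A minimal
sketch of the convention (simplified, with hypothetical function names;
not the kernel code):

	/* Sketch only: a callee that must drop mmap_lock reports it. */
	static int do_one_step(struct mm_struct *mm, bool *lock_dropped)
	{
		/* This step needs to sleep, so give the lock up and say so. */
		mmap_read_unlock(mm);
		*lock_dropped = true;
		return 0;
	}

	static void caller(struct mm_struct *mm)
	{
		bool lock_dropped = false;

		mmap_read_lock(mm);
		do_one_step(mm, &lock_dropped);
		/* The lock is released exactly once on every path. */
		if (!lock_dropped)
			mmap_read_unlock(mm);
	}

The 1 << order to 1L << order change in map_anon_folio_pte_pf() is
defensive: add_mm_counter() takes a long, and a plain 1 << order is
computed in int arithmetic before being widened, so the long literal
avoids any overflow should the order ever approach the bit width of int.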



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites
  2026-03-26  0:25 ` Andrew Morton
@ 2026-03-26  4:44   ` Roman Gushchin
  0 siblings, 0 replies; 14+ messages in thread
From: Roman Gushchin @ 2026-03-26  4:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nico Pache, linux-kernel, linux-mm, aarcange, anshuman.khandual,
	apopple, baohua, baolin.wang, byungchul, catalin.marinas, cl,
	corbet, dave.hansen, david, dev.jain, gourry, hannes, hughd,
	jackmanb, jack, jannh, jglisse, joshua.hahnjy, kas, lance.yang,
	Liam.Howlett, lorenzo.stoakes, mathieu.desnoyers, matthew.brost,
	mhiramat, mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
	richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
	sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe

Andrew Morton <akpm@linux-foundation.org> writes:

> On Wed, 25 Mar 2026 05:40:17 -0600 Nico Pache <npache@redhat.com> wrote:
>
>> MAINTAINER NOTE: This is based on mm-unstable with the corresponding
>> patches reverted then reapplied.
>
> Unfortunately the update-in-place trick fooled the AI review, which
> might otherwise have been useful.  Oh well.  In retrospect we could
> have avoided that if you'd asked me to drop v3 a couple of days before
> mailing out v4.
>
> otoh, this series *does* apply to the mm-stable branch.  Roman, I
> thought Sashiko was attempting that?

It did, but for some reason it failed.
You can see which trees/branches it tried under the Baseline section
for each patchset. And yes, I need to improve it to log specific SHAs
rather than just branch names like mm/mm-new; right now it's hard to
say why this attempt failed.


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2026-03-26  4:45 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-25 11:40 [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
2026-03-25 11:40 ` [PATCH mm-unstable v4 1/5] mm: consolidate anonymous folio PTE mapping into helpers Nico Pache
2026-03-25 11:40 ` [PATCH mm-unstable v4 2/5] mm: introduce is_pmd_order helper Nico Pache
2026-03-25 12:11   ` Lorenzo Stoakes (Oracle)
2026-03-25 14:45     ` Andrew Morton
2026-03-25 14:49       ` Lorenzo Stoakes (Oracle)
2026-03-25 16:05         ` Andrew Morton
2026-03-25 11:40 ` [PATCH mm-unstable v4 3/5] mm/khugepaged: define KHUGEPAGED_MAX_PTES_LIMIT as HPAGE_PMD_NR - 1 Nico Pache
2026-03-25 11:40 ` [PATCH mm-unstable v4 4/5] mm/khugepaged: rename hpage_collapse_* to collapse_* Nico Pache
2026-03-25 12:08   ` Lorenzo Stoakes (Oracle)
2026-03-25 11:40 ` [PATCH mm-unstable v4 5/5] mm/khugepaged: unify khugepaged and madv_collapse with collapse_single_pmd() Nico Pache
2026-03-25 11:44 ` [PATCH mm-unstable v4 0/5] mm: khugepaged cleanups and mTHP prerequisites Lorenzo Stoakes (Oracle)
2026-03-26  0:25 ` Andrew Morton
2026-03-26  4:44   ` Roman Gushchin
