From: Andrew Morton <akpm@linux-foundation.org>
To: Nico Pache <npache@redhat.com>
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
aarcange@redhat.com, anshuman.khandual@arm.com,
apopple@nvidia.com, baohua@kernel.org,
baolin.wang@linux.alibaba.com, byungchul@sk.com,
catalin.marinas@arm.com, cl@gentwo.org, corbet@lwn.net,
dave.hansen@linux.intel.com, david@kernel.org, dev.jain@arm.com,
gourry@gourry.net, hannes@cmpxchg.org, hughd@google.com,
jack@suse.cz, jackmanb@google.com, jannh@google.com,
jglisse@google.com, joshua.hahnjy@gmail.com, kas@kernel.org,
lance.yang@linux.dev, liam@infradead.org, ljs@kernel.org,
mathieu.desnoyers@efficios.com, matthew.brost@intel.com,
mhiramat@kernel.org, mhocko@suse.com, peterx@redhat.com,
pfalcato@suse.de, rakie.kim@sk.com, raquini@redhat.com,
rdunlap@infradead.org, richard.weiyang@gmail.com,
rientjes@google.com, rostedt@goodmis.org, rppt@kernel.org,
ryan.roberts@arm.com, shivankg@amd.com, sunnanyong@huawei.com,
surenb@google.com, thomas.hellstrom@linux.intel.com,
tiwai@suse.de, usamaarif642@gmail.com, vbabka@suse.cz,
vishal.moola@gmail.com, wangkefeng.wang@huawei.com,
will@kernel.org, willy@infradead.org,
yang@os.amperecomputing.com, ying.huang@linux.alibaba.com,
ziy@nvidia.com, zokeefe@google.com
Subject: Re: [PATCH mm-unstable v17 00/14] khugepaged: mTHP support
Date: Mon, 11 May 2026 14:04:40 -0700 [thread overview]
Message-ID: <20260511140440.d8a71b4774d13537b3977d19@linux-foundation.org> (raw)
In-Reply-To: <20260511185817.686831-1-npache@redhat.com>
On Mon, 11 May 2026 12:58:00 -0600 Nico Pache <npache@redhat.com> wrote:
> The following series provides khugepaged with the capability to collapse
> anonymous memory regions to mTHPs.
Thanks, I've updated mm.git's mm-new branch to this version.
> V17 Changes:
> - Added Acks/RB
> - New patch(5): split the mmap_read_unlock() locking contract change out of
> "generalize collapse_huge_page" into its own patch; add a comment
> documenting the enter/exit-with-lock-dropped contract (Usama, David)
> - [patch 03] Add const to max_ptes_none/shared/swap variables; improve the
> three helper docstrings; replace the paragraphs with inline comments;
> note that sysctl values are now snapshotted once per scan (Usama, David)
> - [patch 04] Add SCAN_INVALID_PTES_NONE result code and return it instead
> of SCAN_FAIL when collapse_max_ptes_none() returns -EINVAL (Usama);
> snapshot khugepaged_max_ptes_none into a local variable to fix race on
> the two comparisons (Usama); clean up mTHP docstring paragraphs into
> inline comments; fix commit message wording (David)
> - [patch 06] Remove /* PMD collapse */ and /* mTHP collapse */ comments
> (David); move const declarations to top of variable list (David); add
> comment explaining that map_anon_folio_pte_nopf() calls set_ptes under
> pmd_ptl and is safe because PMD is expected to be none (Usama)
> - [patch 08] Shorten sysfs counter documentation for
> collapse_exceed_swap/shared_pte to concise one-liners; trim
> collapse_exceed_none_pte description; fix "dont" → "do not" (David)
> - [patch 10] Keep vm_flags parameter in khugepaged_enter_vma() and
> collapse_allowable_orders() rather than dropping it and reading
> vma->vm_flags internally; pass vm_flags explicitly at all three
> collapse_allowable_orders() call sites (David, sashskio)
> - [patch 11] Fix MTHP_STACK_SIZE: was exponential (~128); the correct
>   formula is (height + 1) for a DFS on a binary tree; rewrite the
>   comment to explain the DFS sizing (sashskio)
> - [patch 12] Replace SCAN_PAGE_LRU with SCAN_PAGE_LAZYFREE in the
> "goto next_order" early-bail cases; non-LRU page failures cannot be
> recovered at any order and belong in the default (return) path
> - [patch 13] Use tva_flags == TVA_KHUGEPAGED (strict equality) instead of
> tva_flags & TVA_KHUGEPAGED; flatten nested if into single condition;
> retain vm_flags parameter; pass vm_flags to collapse_allowable_orders()
Here's how v17 altered mm.git:
Documentation/admin-guide/mm/transhuge.rst | 24 ---
include/linux/khugepaged.h | 6
include/trace/events/huge_memory.h | 3
mm/huge_memory.c | 2
mm/khugepaged.c | 152 ++++++++++---------
mm/vma.c | 6
tools/testing/vma/include/stubs.h | 3
7 files changed, 99 insertions(+), 97 deletions(-)
--- a/Documentation/admin-guide/mm/transhuge.rst~b
+++ a/Documentation/admin-guide/mm/transhuge.rst
@@ -725,27 +725,17 @@ nr_anon_partially_mapped
collapse_exceed_none_pte
The number of collapse attempts that failed due to exceeding the
- max_ptes_none threshold. For mTHP collapse, Currently only max_ptes_none
- values of 0 and (HPAGE_PMD_NR - 1) are supported. Any other value will
- emit a warning and no mTHP collapse will be attempted. khugepaged will
- try to collapse to the largest enabled (m)THP size; if it fails, it will
- try the next lower enabled mTHP size. This counter records the number of
- times a collapse attempt was skipped for exceeding the max_ptes_none
- threshold, and khugepaged will move on to the next available mTHP size.
+ max_ptes_none threshold.
collapse_exceed_swap_pte
- The number of anonymous mTHP PTE ranges which were unable to collapse due
- to containing at least one swap PTE. Currently khugepaged does not
- support collapsing mTHP regions that contain a swap PTE. This counter can
- be used to monitor the number of khugepaged mTHP collapses that failed
- due to the presence of a swap PTE.
+ The number of collapse attempts that failed due to exceeding the
+ max_ptes_swap threshold. For non-PMD orders this occurs if a mTHP range
+ contains at least one swap PTE.
collapse_exceed_shared_pte
- The number of anonymous mTHP PTE ranges which were unable to collapse due
- to containing at least one shared PTE. Currently khugepaged does not
- support collapsing mTHP PTE ranges that contain a shared PTE. This
- counter can be used to monitor the number of khugepaged mTHP collapses
- that failed due to the presence of a shared PTE.
+ The number of collapse attempts that failed due to exceeding the
+ max_ptes_shared threshold. For non-PMD orders this occurs if a mTHP range
+ contains at least one shared PTE.
As the system ages, allocating huge pages may be expensive as the
system uses memory compaction to copy data around memory to free a
--- a/include/linux/khugepaged.h~b
+++ a/include/linux/khugepaged.h
@@ -13,7 +13,8 @@ extern void khugepaged_destroy(void);
extern int start_stop_khugepaged(void);
extern void __khugepaged_enter(struct mm_struct *mm);
extern void __khugepaged_exit(struct mm_struct *mm);
-extern void khugepaged_enter_vma(struct vm_area_struct *vma);
+extern void khugepaged_enter_vma(struct vm_area_struct *vma,
+ vm_flags_t vm_flags);
extern void khugepaged_min_free_kbytes_update(void);
extern bool current_is_khugepaged(void);
void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
@@ -37,7 +38,8 @@ static inline void khugepaged_fork(struc
static inline void khugepaged_exit(struct mm_struct *mm)
{
}
-static inline void khugepaged_enter_vma(struct vm_area_struct *vma)
+static inline void khugepaged_enter_vma(struct vm_area_struct *vma,
+ vm_flags_t vm_flags)
{
}
static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
--- a/include/trace/events/huge_memory.h~b
+++ a/include/trace/events/huge_memory.h
@@ -39,7 +39,8 @@
EM( SCAN_STORE_FAILED, "store_failed") \
EM( SCAN_COPY_MC, "copy_poisoned_page") \
EM( SCAN_PAGE_FILLED, "page_filled") \
- EMe(SCAN_PAGE_DIRTY_OR_WRITEBACK, "page_dirty_or_writeback")
+ EM(SCAN_PAGE_DIRTY_OR_WRITEBACK, "page_dirty_or_writeback") \
+ EMe(SCAN_INVALID_PTES_NONE, "invalid_ptes_none")
#undef EM
#undef EMe
--- a/mm/huge_memory.c~b
+++ a/mm/huge_memory.c
@@ -1571,7 +1571,7 @@ vm_fault_t do_huge_pmd_anonymous_page(st
ret = vmf_anon_prepare(vmf);
if (ret)
return ret;
- khugepaged_enter_vma(vma);
+ khugepaged_enter_vma(vma, vma->vm_flags);
if (!(vmf->flags & FAULT_FLAG_WRITE) &&
!mm_forbids_zeropage(vma->vm_mm) &&
--- a/mm/khugepaged.c~b
+++ a/mm/khugepaged.c
@@ -61,6 +61,7 @@ enum scan_result {
SCAN_COPY_MC,
SCAN_PAGE_FILLED,
SCAN_PAGE_DIRTY_OR_WRITEBACK,
+ SCAN_INVALID_PTES_NONE,
};
#define CREATE_TRACE_POINTS
@@ -101,16 +102,15 @@ static struct kmem_cache *mm_slot_cache
#define KHUGEPAGED_MIN_MTHP_ORDER 2
/*
- * The maximum number of mTHP ranges that can be stored on the stack.
- * This is calculated based on the number of PTE entries in a PTE page table
- * and the minimum mTHP order.
+ * mthp_collapse() does an iterative DFS over a binary tree, from
+ * HPAGE_PMD_ORDER down to KHUGEPAGED_MIN_MTHP_ORDER. The max stack
+ * size needed for a DFS on a binary tree is height + 1, where
+ * height = HPAGE_PMD_ORDER - KHUGEPAGED_MIN_MTHP_ORDER.
*
- * ilog2 is needed in place of HPAGE_PMD_ORDER due to some architectures
- * (ie ppc64le) not defining HPAGE_PMD_ORDER until after build time.
- *
- * At most there will be 1 << (PMD_ORDER - KHUGEPAGED_MIN_MTHP_ORDER) mTHP ranges
+ * ilog2 is used in place of HPAGE_PMD_ORDER because some architectures
+ * (e.g. ppc64le) do not define HPAGE_PMD_ORDER until after build time.
*/
-#define MTHP_STACK_SIZE (1UL << (ilog2(MAX_PTRS_PER_PTE) - KHUGEPAGED_MIN_MTHP_ORDER))
+#define MTHP_STACK_SIZE (ilog2(MAX_PTRS_PER_PTE) - KHUGEPAGED_MIN_MTHP_ORDER + 1)
/*
* Defines a range of PTE entries in a PTE page table which are being
@@ -380,89 +380,87 @@ static bool pte_none_or_zero(pte_t pte)
}
/**
- * collapse_max_ptes_none - Calculate maximum allowed empty PTEs for collapse
+ * collapse_max_ptes_none - Calculate maximum allowed none-page or zero-page
+ * PTEs for the given collapse operation.
* @cc: The collapse control struct
* @vma: The vma to check for userfaultfd
* @order: The folio order being collapsed to
*
- * If we are not in khugepaged mode use HPAGE_PMD_NR to allow any
- * empty page. For PMD-sized collapses (order == HPAGE_PMD_ORDER), use the
- * configured khugepaged_max_ptes_none value.
- *
- * For mTHP collapses, we currently only support khugepaged_max_pte_none values
- * of 0 or (KHUGEPAGED_MAX_PTES_LIMIT). Any other value will emit a warning and
- * no mTHP collapse will be attempted
- *
- * Return: Maximum number of empty PTEs allowed for the collapse operation
+ * Return: Maximum number of none-page or zero-page PTEs allowed for the
+ * collapse operation.
*/
static int collapse_max_ptes_none(struct collapse_control *cc,
struct vm_area_struct *vma, unsigned int order)
{
+ unsigned int max_ptes_none = khugepaged_max_ptes_none;
+ // If the vma is userfaultfd-armed, allow no none-page or zero-page PTEs.
if (vma && userfaultfd_armed(vma))
return 0;
+ // for MADV_COLLAPSE, allow any none-page or zero-page PTEs.
if (!cc->is_khugepaged)
return HPAGE_PMD_NR;
+ // for PMD collapse, respect the user defined maximum.
if (is_pmd_order(order))
- return khugepaged_max_ptes_none;
+ return max_ptes_none;
/* Zero/non-present collapse disabled. */
- if (!khugepaged_max_ptes_none)
+ if (!max_ptes_none)
return 0;
- if (khugepaged_max_ptes_none == KHUGEPAGED_MAX_PTES_LIMIT)
+ // for mTHP collapse with the sysctl value set to KHUGEPAGED_MAX_PTES_LIMIT,
+ // scale the maximum number of PTEs to the order of the collapse.
+ if (max_ptes_none == KHUGEPAGED_MAX_PTES_LIMIT)
return (1 << order) - 1;
+ // We currently only support max_ptes_none values of 0 or KHUGEPAGED_MAX_PTES_LIMIT.
+ // Emit a warning and return -EINVAL.
pr_warn_once("mTHP collapse only supports max_ptes_none values of 0 or %u\n",
KHUGEPAGED_MAX_PTES_LIMIT);
return -EINVAL;
}
/**
- * collapse_max_ptes_shared - Calculate maximum allowed shared PTEs for collapse
+ * collapse_max_ptes_shared - Calculate maximum allowed PTEs that map shared
+ * anonymous pages for the given collapse operation.
* @cc: The collapse control struct
* @order: The folio order being collapsed to
*
- * If we are not in khugepaged mode use HPAGE_PMD_NR to allow any
- * shared page.
- *
- * For mTHP collapses, we currently dont support collapsing memory with
- * shared memory.
- *
- * Return: Maximum number of shared PTEs allowed for the collapse operation
+ * Return: Maximum number of PTEs that map shared anonymous pages for the
+ * collapse operation
*/
static unsigned int collapse_max_ptes_shared(struct collapse_control *cc,
unsigned int order)
{
+ // for MADV_COLLAPSE, do not restrict the number of PTEs that map shared
+ // anonymous pages.
if (!cc->is_khugepaged)
return HPAGE_PMD_NR;
+ // for mTHP collapse do not allow collapsing anonymous memory pages that
+ // are shared between processes.
if (!is_pmd_order(order))
return 0;
-
+ // for PMD collapse, respect the user defined maximum.
return khugepaged_max_ptes_shared;
}
/**
- * collapse_max_ptes_swap - Calculate maximum allowed swap PTEs for collapse
+ * collapse_max_ptes_swap - Calculate the maximum allowed non-present PTEs or the
+ * maximum allowed non-present pagecache entries for the given collapse operation.
* @cc: The collapse control struct
* @order: The folio order being collapsed to
*
- * If we are not in khugepaged mode use HPAGE_PMD_NR to allow any
- * swap page.
- *
- * For PMD-sized collapses (order == HPAGE_PMD_ORDER), use the configured
- * khugepaged_max_ptes_swap value.
- *
- * For mTHP collapses, we currently dont support collapsing memory with
- * swapped out memory.
- *
- * Return: Maximum number of swap PTEs allowed for the collapse operation
+ * Return: Maximum number of non-present PTEs or the maximum allowed non-present
+ * pagecache entries for the collapse operation.
*/
static unsigned int collapse_max_ptes_swap(struct collapse_control *cc,
unsigned int order)
{
+ // for MADV_COLLAPSE, do not restrict the number of PTE entries or
+ // pagecache entries that are non-present.
if (!cc->is_khugepaged)
return HPAGE_PMD_NR;
+ // for mTHP collapse do not allow any non-present PTEs or pagecache entries.
if (!is_pmd_order(order))
return 0;
-
+ // for PMD collapse, respect the user defined maximum.
return khugepaged_max_ptes_swap;
}
@@ -478,7 +476,7 @@ int hugepage_madvise(struct vm_area_stru
* register it here without waiting a page fault that
* may not happen any time soon.
*/
- khugepaged_enter_vma(vma);
+ khugepaged_enter_vma(vma, *vm_flags);
break;
case MADV_NOHUGEPAGE:
*vm_flags &= ~VM_HUGEPAGE;
@@ -579,26 +577,26 @@ void __khugepaged_enter(struct mm_struct
/* Check what orders are allowed based on the vma and collapse type */
static unsigned long collapse_allowable_orders(struct vm_area_struct *vma,
- enum tva_type tva_flags)
+ vm_flags_t vm_flags, enum tva_type tva_flags)
{
unsigned long orders;
/* If khugepaged is scanning an anonymous vma, allow mTHP collapse */
- if ((tva_flags & TVA_KHUGEPAGED) && vma_is_anonymous(vma))
+ if ((tva_flags == TVA_KHUGEPAGED) && vma_is_anonymous(vma))
orders = THP_ORDERS_ALL_ANON;
else
orders = BIT(HPAGE_PMD_ORDER);
- return thp_vma_allowable_orders(vma, vma->vm_flags, tva_flags, orders);
+ return thp_vma_allowable_orders(vma, vm_flags, tva_flags, orders);
}
-void khugepaged_enter_vma(struct vm_area_struct *vma)
+void khugepaged_enter_vma(struct vm_area_struct *vma,
+ vm_flags_t vm_flags)
{
if (!mm_flags_test(MMF_VM_HUGEPAGE, vma->vm_mm) &&
- hugepage_enabled()) {
- if (collapse_allowable_orders(vma, TVA_KHUGEPAGED))
- __khugepaged_enter(vma->vm_mm);
- }
+ collapse_allowable_orders(vma, vm_flags, TVA_KHUGEPAGED) &&
+ hugepage_enabled())
+ __khugepaged_enter(vma->vm_mm);
}
void __khugepaged_exit(struct mm_struct *mm)
@@ -683,7 +681,7 @@ static enum scan_result __collapse_huge_
unsigned int max_ptes_shared = collapse_max_ptes_shared(cc, order);
if (max_ptes_none < 0)
- return result;
+ return SCAN_INVALID_PTES_NONE;
for (_pte = pte; _pte < pte + nr_pages;
_pte++, addr += PAGE_SIZE) {
@@ -905,6 +903,7 @@ static void __collapse_huge_page_copy_fa
{
const unsigned long nr_pages = 1UL << order;
spinlock_t *pmd_ptl;
+
/*
* Re-establish the PMD to point to the original page table
* entry. Restoring PMD needs to be done prior to releasing
@@ -944,6 +943,7 @@ static enum scan_result __collapse_huge_
const unsigned long nr_pages = 1UL << order;
unsigned int i;
enum scan_result result = SCAN_SUCCEED;
+
/*
* Copying pages' contents is subject to memory poison at any iteration.
*/
@@ -1263,10 +1263,20 @@ static enum scan_result alloc_charge_fol
return SCAN_SUCCEED;
}
+/*
+ * collapse_huge_page expects the mmap_read_lock to be dropped before
+ * entering this function. The function will also always return with the lock
+ * dropped. The function starts by allocating a folio, which can potentially
+ * take a long time if it involves sync compaction, and we do not need to hold
+ * the mmap_lock during that. We must recheck the vma after taking it again in
+ * write mode.
+ */
static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long start_addr,
int referenced, int unmapped, struct collapse_control *cc,
unsigned int order)
{
+ const unsigned long pmd_addr = start_addr & HPAGE_PMD_MASK;
+ const unsigned long end_addr = start_addr + (PAGE_SIZE << order);
LIST_HEAD(compound_pagelist);
pmd_t *pmd, _pmd;
pte_t *pte = NULL;
@@ -1277,8 +1287,6 @@ static enum scan_result collapse_huge_pa
struct vm_area_struct *vma;
struct mmu_notifier_range range;
bool anon_vma_locked = false;
- const unsigned long pmd_addr = start_addr & HPAGE_PMD_MASK;
- const unsigned long end_addr = start_addr + (PAGE_SIZE << order);
result = alloc_charge_folio(&folio, mm, cc, order);
if (result != SCAN_SUCCEED)
@@ -1399,11 +1407,16 @@ static enum scan_result collapse_huge_pa
__folio_mark_uptodate(folio);
spin_lock(pmd_ptl);
WARN_ON_ONCE(!pmd_none(*pmd));
- if (is_pmd_order(order)) { /* PMD collapse */
+ if (is_pmd_order(order)) {
pgtable = pmd_pgtable(_pmd);
pgtable_trans_huge_deposit(mm, pmd, pgtable);
map_anon_folio_pmd_nopf(folio, pmd, vma, pmd_addr);
- } else { /* mTHP collapse */
+ } else {
+ /*
+ * set_ptes is called in map_anon_folio_pte_nopf with the
+ * pmd_ptl lock still held; this is safe as the PMD is expected
+ * to be none. The pmd entry is then repopulated below.
+ */
map_anon_folio_pte_nopf(folio, pte, vma, start_addr, /*uffd_wp=*/ false);
smp_wmb(); /* make PTEs visible before PMD. See pmd_install() */
pmd_populate(mm, pmd, pmd_pgtable(_pmd));
@@ -1538,12 +1551,12 @@ static int mthp_collapse(struct mm_struc
case SCAN_EXCEED_SHARED_PTE:
case SCAN_PAGE_LOCK:
case SCAN_PAGE_COUNT:
- case SCAN_PAGE_LRU:
case SCAN_PAGE_NULL:
case SCAN_DEL_PAGE_LRU:
case SCAN_PTE_NON_PRESENT:
case SCAN_PTE_UFFD_WP:
case SCAN_ALLOC_HUGE_PAGE_FAIL:
+ case SCAN_PAGE_LAZYFREE:
goto next_order;
/* Cases where no further collapse is possible */
default:
@@ -1569,6 +1582,10 @@ static enum scan_result collapse_scan_pm
struct vm_area_struct *vma, unsigned long start_addr,
bool *lock_dropped, struct collapse_control *cc)
{
+ int max_ptes_none = collapse_max_ptes_none(cc, vma, HPAGE_PMD_ORDER);
+ const unsigned int max_ptes_shared = collapse_max_ptes_shared(cc, HPAGE_PMD_ORDER);
+ const unsigned int max_ptes_swap = collapse_max_ptes_swap(cc, HPAGE_PMD_ORDER);
+ enum tva_type tva_flags = cc->is_khugepaged ? TVA_KHUGEPAGED : TVA_FORCED_COLLAPSE;
pmd_t *pmd;
pte_t *pte, *_pte, pteval;
int i;
@@ -1580,10 +1597,6 @@ static enum scan_result collapse_scan_pm
unsigned long enabled_orders;
spinlock_t *ptl;
int node = NUMA_NO_NODE, unmapped = 0;
- int max_ptes_none = collapse_max_ptes_none(cc, vma, HPAGE_PMD_ORDER);
- unsigned int max_ptes_shared = collapse_max_ptes_shared(cc, HPAGE_PMD_ORDER);
- unsigned int max_ptes_swap = collapse_max_ptes_swap(cc, HPAGE_PMD_ORDER);
- enum tva_type tva_flags = cc->is_khugepaged ? TVA_KHUGEPAGED : TVA_FORCED_COLLAPSE;
VM_BUG_ON(start_addr & ~HPAGE_PMD_MASK);
@@ -1597,7 +1610,7 @@ static enum scan_result collapse_scan_pm
memset(cc->node_load, 0, sizeof(cc->node_load));
nodes_clear(cc->alloc_nmask);
- enabled_orders = collapse_allowable_orders(vma, tva_flags);
+ enabled_orders = collapse_allowable_orders(vma, vma->vm_flags, tva_flags);
/*
* If PMD is the only enabled order, enforce max_ptes_none, otherwise
@@ -1757,12 +1770,7 @@ static enum scan_result collapse_scan_pm
out_unmap:
pte_unmap_unlock(pte, ptl);
if (result == SCAN_SUCCEED) {
- /*
- * Before allocating the hugepage, release the mmap_lock read lock.
- * The allocation can take potentially a long time if it involves
- * sync compaction, and we do not need to hold the mmap_lock during
- * that. We will recheck the vma after taking it again in write mode.
- */
+ /* collapse_huge_page expects the lock to be dropped before calling */
mmap_read_unlock(mm);
nr_collapsed = mthp_collapse(mm, start_addr, referenced, unmapped,
cc, enabled_orders);
@@ -2657,14 +2665,14 @@ static enum scan_result collapse_scan_fi
unsigned long addr, struct file *file, pgoff_t start,
struct collapse_control *cc)
{
+ const int max_ptes_none = collapse_max_ptes_none(cc, NULL, HPAGE_PMD_ORDER);
+ const unsigned int max_ptes_swap = collapse_max_ptes_swap(cc, HPAGE_PMD_ORDER);
struct folio *folio = NULL;
struct address_space *mapping = file->f_mapping;
XA_STATE(xas, &mapping->i_pages, start);
int present, swap;
int node = NUMA_NO_NODE;
enum scan_result result = SCAN_SUCCEED;
- int max_ptes_none = collapse_max_ptes_none(cc, NULL, HPAGE_PMD_ORDER);
- unsigned int max_ptes_swap = collapse_max_ptes_swap(cc, HPAGE_PMD_ORDER);
present = 0;
swap = 0;
@@ -2867,7 +2875,7 @@ static void collapse_scan_mm_slot(unsign
cc->progress++;
break;
}
- if (!collapse_allowable_orders(vma, TVA_KHUGEPAGED)) {
+ if (!collapse_allowable_orders(vma, vma->vm_flags, TVA_KHUGEPAGED)) {
cc->progress++;
continue;
}
@@ -3177,7 +3185,7 @@ int madvise_collapse(struct vm_area_stru
BUG_ON(vma->vm_start > start);
BUG_ON(vma->vm_end < end);
- if (!collapse_allowable_orders(vma, TVA_FORCED_COLLAPSE))
+ if (!collapse_allowable_orders(vma, vma->vm_flags, TVA_FORCED_COLLAPSE))
return -EINVAL;
cc = kmalloc_obj(*cc);
--- a/mm/vma.c~b
+++ a/mm/vma.c
@@ -989,7 +989,7 @@ static __must_check struct vm_area_struc
goto abort;
vma_set_flags_mask(vmg->target, sticky_flags);
- khugepaged_enter_vma(vmg->target);
+ khugepaged_enter_vma(vmg->target, vmg->vm_flags);
vmg->state = VMA_MERGE_SUCCESS;
return vmg->target;
@@ -1110,7 +1110,7 @@ struct vm_area_struct *vma_merge_new_ran
* following VMA if we have VMAs on both sides.
*/
if (vmg->target && !vma_expand(vmg)) {
- khugepaged_enter_vma(vmg->target);
+ khugepaged_enter_vma(vmg->target, vmg->vm_flags);
vmg->state = VMA_MERGE_SUCCESS;
return vmg->target;
}
@@ -2589,7 +2589,7 @@ static int __mmap_new_vma(struct mmap_st
* call covers the non-merge case.
*/
if (!vma_is_anonymous(vma))
- khugepaged_enter_vma(vma);
+ khugepaged_enter_vma(vma, map->vm_flags);
*vmap = vma;
return 0;
--- a/tools/testing/vma/include/stubs.h~b
+++ a/tools/testing/vma/include/stubs.h
@@ -183,7 +183,8 @@ static inline bool mpol_equal(struct mem
return true;
}
-static inline void khugepaged_enter_vma(struct vm_area_struct *vma)
+static inline void khugepaged_enter_vma(struct vm_area_struct *vma,
+ vm_flags_t vm_flags)
{
}
_
Thread overview: 16+ messages
2026-05-11 18:58 [PATCH mm-unstable v17 00/14] khugepaged: mTHP support Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 01/14] mm/khugepaged: generalize hugepage_vma_revalidate for " Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 02/14] mm/khugepaged: generalize alloc_charge_folio() Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 03/14] mm/khugepaged: rework max_ptes_* handling with helper functions Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 04/14] mm/khugepaged: generalize __collapse_huge_page_* for mTHP support Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 05/14] mm/khugepaged: require collapse_huge_page to enter/exit with the lock dropped Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 06/14] mm/khugepaged: generalize collapse_huge_page for mTHP collapse Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 07/14] mm/khugepaged: skip collapsing mTHP to smaller orders Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 08/14] mm/khugepaged: add per-order mTHP collapse failure statistics Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 09/14] mm/khugepaged: improve tracepoints for mTHP orders Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 10/14] mm/khugepaged: introduce collapse_allowable_orders helper function Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 11/14] mm/khugepaged: Introduce mTHP collapse support Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 12/14] mm/khugepaged: avoid unnecessary mTHP collapse attempts Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 13/14] mm/khugepaged: run khugepaged for all orders Nico Pache
2026-05-11 18:58 ` [PATCH mm-unstable v17 14/14] Documentation: mm: update the admin guide for mTHP collapse Nico Pache
2026-05-11 21:04 ` Andrew Morton [this message]