From: sashiko-bot@kernel.org
To: "wang lian" <lianux.mm@gmail.com>
Cc: damon@lists.linux.dev
Subject: Re: [PATCH v2 2/5] mm/khugepaged: add damon_collapse_folio_range() for external callers
Date: Wed, 01 Jul 2026 12:02:26 +0000 [thread overview]
Message-ID: <20260701120226.DD8811F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260701114716.56503-3-lianux.mm@gmail.com>
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] Allocating the large `struct collapse_control` directly on the kernel stack risks stack overflow.
- [High] Bypassing the PMD pre-scan phase violates khugepaged invariants, causing logic errors, performance degradation, and potential crashes.
- [Medium] Missing `lru_add_drain_all()` before collapsing causes spurious collapse failures due to inflated page refcounts.
- [Medium] Successful pre-existing THP state (`SCAN_PMD_MAPPED`) is incorrectly translated into a hard `-EINVAL` error.
--
commit bb75d3df2a4ada9ebcc7237ae7417a21c9fbc4f8
Author: Wang Lian <lianux.mm@gmail.com>
mm/khugepaged: add damon_collapse_folio_range() for external callers
Export a wrapper around collapse_huge_page() that allows external
subsystems such as DAMON to trigger THP collapse on a target address range.
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 617bca76db49b..0387841ba2e7d 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -3272,3 +3272,42 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
> return thps == ((hend - hstart) >> HPAGE_PMD_SHIFT) ? 0
> : madvise_collapse_errno(last_fail);
> }
> +
> +/**
> + * damon_collapse_folio_range() - Collapse base pages in range into a THP
> + * @mm: mm_struct of the target process
> + * @start_addr: start address (must be order-aligned)
> + * @target_order: page order of the collapse result (currently only
> + * HPAGE_PMD_ORDER is supported)
> + *
> + * Thin wrapper around collapse_huge_page() for external callers such as
> + * DAMON. The caller must hold a reference to @mm. Do not hold mmap
> + * lock: collapse_huge_page() acquires mmap_read_lock for validation,
> + * releases it, then acquires mmap_write_lock for the collapse. Holding
> + * an outer mmap_read_lock would self-deadlock.
> + *
> + * Return: 0 on success, -EINVAL on bad arguments, negative error from
> + * madvise_collapse_errno() otherwise.
> + */
> +int damon_collapse_folio_range(struct mm_struct *mm, unsigned long start_addr,
> + unsigned int target_order)
> +{
> + struct collapse_control cc = {
> + .is_khugepaged = false,
> + };
[Severity: High]
Does allocating struct collapse_control directly on the stack risk a stack
overflow?
The collapse_control structure contains a large array:
struct collapse_control {
...
u32 node_load[MAX_NUMNODES];
...
};
With CONFIG_NODES_SHIFT=10 on x86_64, MAX_NUMNODES is 1024, making the struct
larger than 4KB. Allocating this directly on the stack could overflow the
kernel stack.
Should this be heap-allocated with kmalloc_obj() as done in
madvise_collapse()?
> + enum scan_result result;
> +
> + if (target_order != HPAGE_PMD_ORDER) {
> + pr_warn_once("%s: only PMD order (%u) is supported, got %u\n",
> + __func__, HPAGE_PMD_ORDER, target_order);
> + return -EINVAL;
> + }
> + if (start_addr & ((PAGE_SIZE << target_order) - 1))
> + return -EINVAL;
> +
> + result = collapse_huge_page(mm, start_addr, 1, 0, &cc, target_order);
[Severity: High]
Is it safe to bypass the collapse_scan_pmd() pre-scan phase and directly call
collapse_huge_page()?
By skipping the pre-scan, several issues appear to occur:
First, collapse_huge_page() unconditionally allocates and charges a 2MB folio
before validating the VMA:
static enum scan_result collapse_huge_page(...)
{
...
result = alloc_charge_folio(&folio, mm, cc, order);
...
}
Second, because cc.node_load is never populated by the scan,
collapse_find_target_node() always defaults to node 0, breaking NUMA locality.
Finally, if a non-anonymous page is legitimately mapped into an anonymous VMA,
__collapse_huge_page_isolate() could trigger a kernel panic via:
VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio);
Does this sequence need to invoke the pre-scan phase before collapsing?
[Severity: Medium]
Will this reliably fail to collapse pages that are currently in per-CPU LRU
batches?
Without calling lru_add_drain_all() before collapse_huge_page(), transient
references held by the LRU batches inflate the page reference counts. The
strict exclusivity check in __collapse_huge_page_isolate() will then return
SCAN_PAGE_COUNT, resulting in a spurious -EAGAIN failure.
> + if (result == SCAN_SUCCEED)
> + return 0;
> + return madvise_collapse_errno(result);
[Severity: Medium]
Does this code handle SCAN_PMD_MAPPED correctly when memory is already backed
by a THP?
If the range is already PMD-mapped, collapse_huge_page() detects this in
find_pmd_or_thp_or_none() and returns SCAN_PMD_MAPPED. Because this function
only checks for SCAN_SUCCEED, SCAN_PMD_MAPPED falls through to
madvise_collapse_errno(), which lacks a case for it and returns -EINVAL.
Should SCAN_PMD_MAPPED be treated as a success case?
> +}
> +EXPORT_SYMBOL_GPL(damon_collapse_folio_range);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701114716.56503-1-lianux.mm@gmail.com?part=2
next prev parent reply other threads:[~2026-07-01 12:02 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-07-01 11:47 [PATCH v2 0/5] mm/damon: add mTHP collapse and split actions wang lian
2026-07-01 11:47 ` [PATCH v2 1/5] mm/damon: add target_order field for DAMOS_COLLAPSE wang lian
2026-07-01 12:07 ` sashiko-bot
2026-07-01 11:47 ` [PATCH v2 2/5] mm/khugepaged: add damon_collapse_folio_range() for external callers wang lian
2026-07-01 12:02 ` sashiko-bot [this message]
2026-07-01 11:47 ` [PATCH v2 3/5] mm/damon/vaddr: implement mTHP-aware DAMOS_COLLAPSE handler wang lian
2026-07-01 12:02 ` sashiko-bot
2026-07-01 11:47 ` [PATCH v2 4/5] mm/damon: introduce DAMOS_SPLIT action wang lian
2026-07-01 12:04 ` sashiko-bot
2026-07-01 11:47 ` [PATCH v2 5/5] mm/damon/vaddr: implement DAMOS_SPLIT handler wang lian
2026-07-01 11:57 ` sashiko-bot
2026-07-01 13:52 ` [PATCH v2 0/5] mm/damon: add mTHP collapse and split actions SJ Park
2026-07-02 6:52 ` wang lian
2026-07-02 16:10 ` SJ Park
2026-07-02 7:02 ` wang lian
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260701120226.DD8811F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=damon@lists.linux.dev \
--cc=lianux.mm@gmail.com \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox