The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <ljs@kernel.org>
To: Lian Wang <lianux.mm@gmail.com>
Cc: sj@kernel.org, damon@lists.linux.dev, linux-mm@kvack.org,
	 daichaobing@sangfor.com.cn, kunwu.chan@gmail.com,
	Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@kernel.org>, Zi Yan <ziy@nvidia.com>,
	 Baolin Wang <baolin.wang@linux.alibaba.com>,
	"Liam R. Howlett" <liam@infradead.org>,
	 Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	 Barry Song <baohua@kernel.org>,
	Lance Yang <lance.yang@linux.dev>,
	linux-kernel@vger.kernel.org
Subject: Re: [RESEND RFC PATCH v2 2/5] mm/khugepaged: add damon_collapse_folio_range() for external callers
Date: Thu, 2 Jul 2026 12:07:01 +0100	[thread overview]
Message-ID: <akZDj6s8U5_Mnetv@lucifer> (raw)
In-Reply-To: <20260702094633.75658-3-lianux.mm@gmail.com>

(+cc missing people again)

Sorry but we're not going to accept anything that exports THP logic like this at
all.

And a damon wrapper in core mm code is just a non-starter, so you really need to
rethink your approach.

I think SJ already commented on this in your v1 from what I can see? I'd listen
to his advice on this :)

On Thu, Jul 02, 2026 at 05:46:30PM +0800, Lian Wang wrote:
> Export a thin wrapper around collapse_huge_page() that allows external
> subsystems such as DAMON to trigger THP collapse on a target address
> range.
>
> Currently restricted to PMD order (HPAGE_PMD_ORDER), since
> collapse_huge_page() does not yet support arbitrary mTHP orders.
> The restriction can be relaxed when khugepaged gains mTHP support.
>
> The caller must hold a reference to @mm.  Do not hold mmap lock:
> collapse_huge_page() acquires mmap_read_lock for validation, releases
> it, then acquires mmap_write_lock for the actual collapse.  Holding
> an outer mmap_read_lock would cause a self-deadlock when the same
> thread attempts the inner mmap_write_lock.
>
> Co-developed-by: Kunwu Chan <kunwu.chan@gmail.com>
> Signed-off-by: Kunwu Chan <kunwu.chan@gmail.com>
> Signed-off-by: Lian Wang <lianux.mm@gmail.com>
> Signed-off-by: Lian Wang <lianux.wang@processmission.com>

This patch is exporting internal mm logic without proper safeguards so it's just
not something we're going to accept, sorry.

(Also not sure it's correct to have multiple S-o-b for the same person (unless
re-tagging a backport or something)? I'm not sure though)

> ---
>  include/linux/khugepaged.h |  9 ++++++++
>  mm/khugepaged.c            | 46 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 55 insertions(+)
>
> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> index d7a9053ff4fe..f7d49cba712f 100644
> --- a/include/linux/khugepaged.h
> +++ b/include/linux/khugepaged.h
> @@ -20,6 +20,9 @@ extern bool current_is_khugepaged(void);
>  void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
>  		bool install_pmd);
>
> +int damon_collapse_folio_range(struct mm_struct *mm, unsigned long start_addr,
> +			       unsigned int target_order);

No thanks. We're not putting damon wrappers into core code. This is breaking the
abstraction and letting arbitrary users invoke internal mm logic.

Plus you're literally exporting this so it can be abused by drivers.

> +
>  static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
>  {
>  	if (mm_flags_test(MMF_VM_HUGEPAGE, oldmm))
> @@ -47,6 +50,12 @@ static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
>  {
>  }
>
> +static inline int damon_collapse_folio_range(struct mm_struct *mm,
> +		unsigned long start_addr, unsigned int target_order)
> +{
> +	return -EINVAL;
> +}
> +
>  static inline void khugepaged_min_free_kbytes_update(void)
>  {
>  }
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 617bca76db49..7fe9ce1e0533 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -3272,3 +3272,49 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>  	return thps == ((hend - hstart) >> HPAGE_PMD_SHIFT) ? 0
>  			: madvise_collapse_errno(last_fail);
>  }
> +
> +/**
> + * damon_collapse_folio_range() - Collapse base pages in range into a THP
> + * @mm:         mm_struct of the target process
> + * @start_addr: start address (must be order-aligned)
> + * @target_order: page order of the collapse result (currently only
> + *                HPAGE_PMD_ORDER is supported)
> + *
> + * Thin wrapper around collapse_huge_page() for external callers such as
> + * DAMON.  The caller must hold a reference to @mm.  Do not hold mmap

This is really fragile and bug bait.

> + * lock: collapse_huge_page() acquires mmap_read_lock for validation,

This is just gross, you're now collapsing based on an outdated concept of
what the current VMA state is...

You're also losing literally everything that madvise_collapse() does.

AND you're overriding THP limitations like max_ptes_none, which is horrible...

> + * releases it, then acquires mmap_write_lock for the collapse.  Holding
> + * an outer mmap_read_lock would self-deadlock.

This is a sign the interface is wrong!

> + *
> + * Return: 0 on success, -EINVAL on bad arguments, negative error from
> + *         madvise_collapse_errno() otherwise.
> + */
> +int damon_collapse_folio_range(struct mm_struct *mm, unsigned long start_addr,
> +			       unsigned int target_order)
> +{
> +	struct collapse_control *cc;
> +	enum scan_result result;
> +
> +	if (target_order != HPAGE_PMD_ORDER) {
> +		pr_warn_once("%s: only PMD order (%u) is supported, got %u\n",
> +			     __func__, HPAGE_PMD_ORDER, target_order);
> +		return -EINVAL;
> +	}
> +	if (start_addr & ((PAGE_SIZE << target_order) - 1))
> +		return -EINVAL;
> +
> +	cc = kmalloc_obj(*cc);
> +	if (!cc)
> +		return -ENOMEM;
> +	cc->is_khugepaged = false;
> +	cc->progress = 0;
> +
> +	lru_add_drain_all();

This is quite literally a copy/paste from madvise_collapse(). No no no :) code
duplication like this is also unacceptable.

> +
> +	result = collapse_huge_page(mm, start_addr, 1, 0, cc, target_order);
> +	kfree(cc);
> +	if (result == SCAN_SUCCEED || result == SCAN_PMD_MAPPED)
> +		return 0;
> +	return madvise_collapse_errno(result);
> +}
> +EXPORT_SYMBOL_GPL(damon_collapse_folio_range);

We _definitely_ cannot have internal mm logic _exported_.

Yeah sorry you need to rethink this.

Thanks, Lorenzo

  parent reply	other threads:[~2026-07-02 11:07 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260702094633.75658-1-lianux.mm@gmail.com>
2026-07-02 10:23 ` [RESEND RFC PATCH v2 0/5] mm/damon: add mTHP collapse and split actions Lorenzo Stoakes
2026-07-02 16:52   ` SJ Park
     [not found] ` <20260702094633.75658-3-lianux.mm@gmail.com>
2026-07-02 11:07   ` Lorenzo Stoakes [this message]
2026-07-02 19:43     ` [RESEND RFC PATCH v2 2/5] mm/khugepaged: add damon_collapse_folio_range() for external callers SJ Park
2026-07-02 20:32       ` SJ Park
2026-07-02  9:52 [RESEND RFC PATCH v2 0/5] mm/damon: add mTHP collapse and split actions Lian Wang
2026-07-02  9:52 ` [RESEND RFC PATCH v2 2/5] mm/khugepaged: add damon_collapse_folio_range() for external callers Lian Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=akZDj6s8U5_Mnetv@lucifer \
    --to=ljs@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=daichaobing@sangfor.com.cn \
    --cc=damon@lists.linux.dev \
    --cc=david@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=kunwu.chan@gmail.com \
    --cc=lance.yang@linux.dev \
    --cc=liam@infradead.org \
    --cc=lianux.mm@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npache@redhat.com \
    --cc=ryan.roberts@arm.com \
    --cc=sj@kernel.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox