Re: [PATCH v8 mm-new 02/12] mm: thp: remove vm_flags parameter from khugepaged_enter_vma()

From: Usama Arif <usamaarif642@gmail.com>
To: Yafang Shao <laoar.shao@gmail.com>,
	akpm@linux-foundation.org, david@redhat.com, ziy@nvidia.com,
	baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
	dev.jain@arm.com, hannes@cmpxchg.org,
	gutierrez.asier@huawei-partners.com, willy@infradead.org,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	ameryhung@gmail.com, rientjes@google.com, corbet@lwn.net,
	21cnbao@gmail.com, shakeel.butt@linux.dev, tj@kernel.org,
	lance.yang@linux.dev
Cc: bpf@vger.kernel.org, linux-mm@kvack.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Yang Shi <shy828301@gmail.com>
Subject: Re: [PATCH v8 mm-new 02/12] mm: thp: remove vm_flags parameter from khugepaged_enter_vma()
Date: Fri, 26 Sep 2025 15:49:59 +0100
Message-ID: <146b95bd-e0f0-4e6b-a9fa-5a8f11355268@gmail.com>
In-Reply-To: <20250926093343.1000-3-laoar.shao@gmail.com>



On 26/09/2025 10:33, Yafang Shao wrote:
> The khugepaged_enter_vma() function requires handling in two specific
> scenarios:
> 1. New VMA creation
>   When a new VMA is created, if vma->vm_mm is not present in
>   khugepaged_mm_slot, it must be added. In this case,
>   khugepaged_enter_vma() is called after vma->vm_flags have been set,
>   allowing direct use of the VMA's flags.
> 2. VMA flag modification
>   When vma->vm_flags are modified (particularly when VM_HUGEPAGE is set),
>   the system must recheck whether to add vma->vm_mm to khugepaged_mm_slot.
>   Currently, khugepaged_enter_vma() is called before the flag update, so
>   the call must be relocated to occur after vma->vm_flags have been set.
> 
> Additionally, khugepaged_enter_vma() is invoked in other contexts, such as
> during VMA merging. However, these calls are unnecessary because the
> existing VMA already ensures that vma->vm_mm is registered in
> khugepaged_mm_slot. While removing these redundant calls represents a
> potential optimization, that change should be addressed separately.
> Because VMA merging only occurs when the vm_flags of both VMAs are
> identical (excluding special flags like VM_SOFTDIRTY), we can safely use
> target->vm_flags instead.
> 

The patch looks good to me, but if we are sure that khugepaged_enter_vma()
is not needed in the VMA merging case, we should remove those calls in this
patch itself. If the justification for changing which flags are considered
when calling khugepaged_enter_vma() in the merge paths is that the calls are
unnecessary, then we should just remove the calls rather than modify them
(if it's safe and functionally correct :))
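
To make the two arguments in the commit message concrete, here is a toy
userspace model. It is only a sketch and not kernel code: every toy_* name
and TOY_* flag value is made up for illustration, and eligibility is reduced
to "VM_HUGEPAGE is set", which is far simpler than what
thp_vma_allowable_order() really checks. It shows (1) why a helper that
reads vma->vm_flags has to run after the flag update, and (2) why flags that
are identical apart from VM_SOFTDIRTY give the same answer whichever side of
a merge is consulted.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real flags; the values are arbitrary. */
#define TOY_VM_HUGEPAGE  0x1UL
#define TOY_VM_SOFTDIRTY 0x2UL

struct toy_mm  { bool in_khugepaged_slot; };
struct toy_vma { unsigned long vm_flags; struct toy_mm *mm; };

/* Simplified eligibility: only looks at the hugepage hint. */
bool toy_eligible(unsigned long flags)
{
	return (flags & TOY_VM_HUGEPAGE) != 0;
}

/* Models the post-patch helper: flags come from the VMA itself. */
void toy_khugepaged_enter_vma(struct toy_vma *vma)
{
	if (toy_eligible(vma->vm_flags))
		vma->mm->in_khugepaged_slot = true;
}

/* Hook runs before the new flags are stored: registration is missed. */
bool toy_hook_before_update(void)
{
	struct toy_mm mm = { false };
	struct toy_vma vma = { 0, &mm };
	unsigned long new_flags = vma.vm_flags | TOY_VM_HUGEPAGE;

	toy_khugepaged_enter_vma(&vma);
	vma.vm_flags = new_flags;
	return mm.in_khugepaged_slot;
}

/* Hook runs after the new flags are stored: the mm gets registered. */
bool toy_hook_after_update(void)
{
	struct toy_mm mm = { false };
	struct toy_vma vma = { 0, &mm };
	unsigned long new_flags = vma.vm_flags | TOY_VM_HUGEPAGE;

	vma.vm_flags = new_flags;
	toy_khugepaged_enter_vma(&vma);
	return mm.in_khugepaged_slot;
}

/* Merge rule from the commit message: flags equal, ignoring soft-dirty. */
bool toy_flags_mergeable(unsigned long a, unsigned long b)
{
	return ((a ^ b) & ~TOY_VM_SOFTDIRTY) == 0;
}
```

In this model, whenever toy_flags_mergeable(a, b) holds, toy_eligible(a)
and toy_eligible(b) agree, which is the property the commit message relies
on when it switches the merge paths to target->vm_flags. Whether the merge
path calls can be dropped entirely is a separate question that this sketch
does not answer.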

> After this change, we can further remove vm_flags parameter from
> thp_vma_allowable_order(). That will be handled in a followup patch.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Yang Shi <shy828301@gmail.com>
> ---
>  include/linux/khugepaged.h |  6 ++----
>  mm/huge_memory.c           |  2 +-
>  mm/khugepaged.c            | 11 ++---------
>  mm/madvise.c               |  7 +++++++
>  mm/vma.c                   |  6 +++---
>  5 files changed, 15 insertions(+), 17 deletions(-)
> 
> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> index f14680cd9854..b30814d3d665 100644
> --- a/include/linux/khugepaged.h
> +++ b/include/linux/khugepaged.h
> @@ -13,8 +13,7 @@ extern void khugepaged_destroy(void);
>  extern int start_stop_khugepaged(void);
>  extern void __khugepaged_enter(struct mm_struct *mm);
>  extern void __khugepaged_exit(struct mm_struct *mm);
> -extern void khugepaged_enter_vma(struct vm_area_struct *vma,
> -				 vm_flags_t vm_flags);
> +extern void khugepaged_enter_vma(struct vm_area_struct *vma);
>  extern void khugepaged_enter_mm(struct mm_struct *mm);
>  extern void khugepaged_min_free_kbytes_update(void);
>  extern bool current_is_khugepaged(void);
> @@ -39,8 +38,7 @@ static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm
>  static inline void khugepaged_exit(struct mm_struct *mm)
>  {
>  }
> -static inline void khugepaged_enter_vma(struct vm_area_struct *vma,
> -					vm_flags_t vm_flags)
> +static inline void khugepaged_enter_vma(struct vm_area_struct *vma)
>  {
>  }
>  static inline void khugepaged_enter_mm(struct mm_struct *mm)
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 1b81680b4225..ac6601f30e65 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1346,7 +1346,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>  	ret = vmf_anon_prepare(vmf);
>  	if (ret)
>  		return ret;
> -	khugepaged_enter_vma(vma, vma->vm_flags);
> +	khugepaged_enter_vma(vma);
>  
>  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
>  			!mm_forbids_zeropage(vma->vm_mm) &&
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index f47ac8c19447..04121ae7d18d 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -353,12 +353,6 @@ int hugepage_madvise(struct vm_area_struct *vma,
>  #endif
>  		*vm_flags &= ~VM_NOHUGEPAGE;
>  		*vm_flags |= VM_HUGEPAGE;
> -		/*
> -		 * If the vma become good for khugepaged to scan,
> -		 * register it here without waiting a page fault that
> -		 * may not happen any time soon.
> -		 */
> -		khugepaged_enter_vma(vma, *vm_flags);
>  		break;
>  	case MADV_NOHUGEPAGE:
>  		*vm_flags &= ~VM_HUGEPAGE;
> @@ -467,10 +461,9 @@ void khugepaged_enter_mm(struct mm_struct *mm)
>  	__khugepaged_enter(mm);
>  }
>  
> -void khugepaged_enter_vma(struct vm_area_struct *vma,
> -			  vm_flags_t vm_flags)
> +void khugepaged_enter_vma(struct vm_area_struct *vma)
>  {
> -	if (!thp_vma_allowable_order(vma, vm_flags, TVA_KHUGEPAGED, PMD_ORDER))
> +	if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_KHUGEPAGED, PMD_ORDER))
>  		return;
>  
>  	khugepaged_enter_mm(vma->vm_mm);
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 35ed4ab0d7c5..ab8b5d47badb 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1425,6 +1425,13 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
>  	VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK);
>  
>  	error = madvise_update_vma(new_flags, madv_behavior);
> +	/*
> +	 * If the vma become good for khugepaged to scan,
> +	 * register it here without waiting a page fault that
> +	 * may not happen any time soon.
> +	 */
> +	if (!error && new_flags & VM_HUGEPAGE)
> +		khugepaged_enter_mm(vma->vm_mm);
>  out:
>  	/*
>  	 * madvise() returns EAGAIN if kernel resources, such as
> diff --git a/mm/vma.c b/mm/vma.c
> index a1ec405bda25..6a548b0d64cd 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -973,7 +973,7 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
>  	if (err || commit_merge(vmg))
>  		goto abort;
>  
> -	khugepaged_enter_vma(vmg->target, vmg->vm_flags);
> +	khugepaged_enter_vma(vmg->target);
>  	vmg->state = VMA_MERGE_SUCCESS;
>  	return vmg->target;
>  
> @@ -1093,7 +1093,7 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
>  	 * following VMA if we have VMAs on both sides.
>  	 */
>  	if (vmg->target && !vma_expand(vmg)) {
> -		khugepaged_enter_vma(vmg->target, vmg->vm_flags);
> +		khugepaged_enter_vma(vmg->target);
>  		vmg->state = VMA_MERGE_SUCCESS;
>  		return vmg->target;
>  	}
> @@ -2520,7 +2520,7 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
>  	 * call covers the non-merge case.
>  	 */
>  	if (!vma_is_anonymous(vma))
> -		khugepaged_enter_vma(vma, map->vm_flags);
> +		khugepaged_enter_vma(vma);
>  	*vmap = vma;
>  	return 0;
>
