public inbox for linux-kernel@vger.kernel.org
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Zi Yan <ziy@nvidia.com>, "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	Nico Pache <npache@redhat.com>, Song Liu <songliubraving@fb.com>,
	Chris Mason <clm@fb.com>, David Sterba <dsterba@suse.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>, Lance Yang <lance.yang@linux.dev>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Shuah Khan <shuah@kernel.org>,
	linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH 7.2 v2 05/12] mm/khugepaged: remove READ_ONLY_THP_FOR_FS check in hugepage_pmd_enabled()
Date: Wed, 15 Apr 2026 14:36:40 +0800
Message-ID: <b7aa5515-6908-4ec6-91e9-eb79d5ef70b4@linux.alibaba.com>
In-Reply-To: <7468C68E-FB09-4714-94A3-4BED63453295@nvidia.com>



On 4/15/26 2:25 AM, Zi Yan wrote:
> On 14 Apr 2026, at 14:14, David Hildenbrand (Arm) wrote:
> 
>> On 4/14/26 18:30, Zi Yan wrote:
>>> On 14 Apr 2026, at 7:02, David Hildenbrand (Arm) wrote:
>>>
>>>> On 4/13/26 22:42, Zi Yan wrote:
>>>>>
>>>>>
>>>>
>>>> I assume such a change should come before patch #4, as it seems to affect
>>>> the functionality that depended on CONFIG_READ_ONLY_THP_FOR_FS.
>>>
>>> If the goal is to have a knob of khugepaged for all files, yes I will move
>>> the change before Patch 4.
>>>
>>>>
>>>>> I thought about this, but it means khugepaged is turned on regardless of
>>>>> anon and shmem configs. I tend to think the original code was a bug,
>>>>> since enabling CONFIG_READ_ONLY_THP_FOR_FS would enable khugepaged all
>>>>> the time.
>>>>
>>>> There might be some FS mapping to collapse? So that makes sense to
>>>> some degree.
>>>>
>>>> I really don't like the side-effects of "/sys/kernel/mm/transparent_hugepage/enabled".
>>>> Like, enabling khugepaged+PMD for files.
>>>>
>>>
>>> I am not a fan either, but I was not sure about another sysfs knob.
>>>
>>
>> Yeah, it would be better if we could avoid it. But the dependency on the
>> global toggle as it is today is a bit weird.
>>
>>>>>
>>>>>
>>>>> Alternatives could be:
>>>>> 1. to add a file-backed khugepaged config, but another sysfs?
>>>>
>>>> Maybe that would be the time to decouple file THP logic from
>>>> hugepage_global_enabled()/hugepage_global_always().
>>>>
>>>> In particular, as pagecache folio allocation doesn't really care about __thp_vma_allowable_orders() IIRC.
>>>>
>>>> I'm thinking about something like the following:
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index b2a6060b3c20..fb3a4fd84fe0 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -184,15 +184,6 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>>>                                                     forced_collapse);
>>>>
>>>>          if (!vma_is_anonymous(vma)) {
>>>> -               /*
>>>> -                * Enforce THP collapse requirements as necessary. Anonymous vmas
>>>> -                * were already handled in thp_vma_allowable_orders().
>>>> -                */
>>>> -               if (!forced_collapse &&
>>>> -                   (!hugepage_global_enabled() || (!(vm_flags & VM_HUGEPAGE) &&
>>>> -                                                   !hugepage_global_always())))
>>>> -                       return 0;
>>>> -
>>>>                  /*
>>>>                   * Trust that ->huge_fault() handlers know what they are doing
>>>>                   * in fault path.
>>>
>>> Looks reasonable.
>>
>> I don't think there is other interaction with FS and the global toggle
>> besides this and the one you are adjusting, right?
>>
>>>
>>>>
>>>> Then, we might indeed just want a khugepaged toggle whether to enable it at
>>>> all in files. (or just a toggle to disable khugepaged entirely?)
>>>>
>>>
>>> I think hugepage_global_enabled() should be enough to decide whether khugepaged
>>> should run or not.

I'm afraid not. Please also consider the per-size mTHP interfaces. It's 
possible that hugepage_global_enabled() returns false, but 
hugepages-2048kB/enabled is set to "always", which would still allow 
khugepaged to collapse folios.
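
To illustrate the case above, here is a minimal userspace sketch (not kernel
code). The struct and the model_*() helpers are hypothetical; they only mirror
the anon checks visible in the quoted hugepage_pmd_enabled(). PMD_ORDER is
assumed to be 9 (2048kB with 4kB base pages), and the global check is assumed
to be true for "always" or "madvise".

```c
#include <stdbool.h>

/* Assumption: 2048kB PMD-sized THP with 4kB base pages. */
#define PMD_ORDER 9

/* Hypothetical plain-data stand-in for the kernel's THP sysfs state. */
struct thp_settings {
	bool global_always;         /* transparent_hugepage/enabled = always */
	bool global_madvise;        /* transparent_hugepage/enabled = madvise */
	unsigned long anon_always;  /* orders whose per-size knob is "always" */
	unsigned long anon_madvise; /* orders whose per-size knob is "madvise" */
	unsigned long anon_inherit; /* orders whose per-size knob is "inherit" */
};

/* Models hugepage_global_enabled(): only looks at the global toggle. */
static bool model_global_enabled(const struct thp_settings *s)
{
	return s->global_always || s->global_madvise;
}

/* Models test_bit() on an orders bitmask. */
static bool order_set(unsigned long orders, unsigned int order)
{
	return ((orders >> order) & 1UL) != 0;
}

/* Models the anon part of the quoted hugepage_pmd_enabled(). */
static bool model_anon_pmd_enabled(const struct thp_settings *s)
{
	if (order_set(s->anon_always, PMD_ORDER))
		return true;
	if (order_set(s->anon_madvise, PMD_ORDER))
		return true;
	/* Only "inherit" depends on the global toggle. */
	if (order_set(s->anon_inherit, PMD_ORDER) && model_global_enabled(s))
		return true;
	return false;
}
```

With the global toggle off and only the PMD-order bit set in anon_always,
model_global_enabled() is false while model_anon_pmd_enabled() is true, which
is the configuration described above.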

>> That would also be an option and would likely avoid other toggles.
>>
>> So __thp_vma_allowable_orders() would allow THPs in any case for FS,
>> but hugepage_global_enabled() would control whether khugepaged runs (for
>> fs).
>>
>> It gives less flexibility, but likely that's ok.
>>
>>>
>>> Currently, we have thp_vma_allowable_orders() to filter each VMAs and I do not
>>> see a reason to use hugepage_pmd_enabled() to guard khugepaged daemon. I am
>>> going to just remove hugepage_pmd_enabled() and replace it with
>>> hugepage_global_enabled(). Let me know your thoughts.
>>
>> Can you send a quick draft of what you have in mind?
> 
>  From ee9e1c18b41111db7248db7fb64693b91e32255d Mon Sep 17 00:00:00 2001
> From: Zi Yan <ziy@nvidia.com>
> Date: Tue, 14 Apr 2026 14:17:31 -0400
> Subject: [PATCH] mm/khugepaged: replace hugepage_pmd_enabled with
>   hugepage_global_enabled
> 
> thp_vma_allowable_orders() is used to guard khugepaged scanning logic in
> collapse_scan_mm_slot() based on enabled THP/mTHP orders by only allowing
> PMD_ORDER. hugepage_pmd_enabled() is a duplication of it for khugepaged
> start/stop control. Simplify the control by checking
> hugepage_global_enabled() instead and let thp_vma_allowable_orders() filter
> khugepaged scanning.

It appears this would prevent shmem collapse, since 
hugepage_global_enabled() doesn’t consider the THP settings for 
shmem/tmpfs (only for anonymous memory).
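
A rough userspace comparison of the two gates, as a sketch only. The struct
and both gate_*() helpers are hypothetical; each bool is a stand-in for the
result of the kernel check named in its comment, and the shmem internals are
deliberately not modelled.

```c
#include <stdbool.h>

/* Hypothetical inputs; each field stands in for one kernel-side check. */
struct khugepaged_inputs {
	bool read_only_thp_for_fs; /* IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) */
	bool global_enabled;       /* hugepage_global_enabled() */
	bool anon_pmd_enabled;     /* any anon PMD-order check that passes */
	bool shmem_pmd_enabled;    /* CONFIG_SHMEM && shmem_hpage_pmd_enabled() */
};

/* Shape of the gate removed by the draft: hugepage_pmd_enabled(). */
static bool gate_current(const struct khugepaged_inputs *in)
{
	if (in->read_only_thp_for_fs && in->global_enabled)
		return true;
	if (in->anon_pmd_enabled)
		return true;
	if (in->shmem_pmd_enabled)
		return true;
	return false;
}

/* Shape of the gate in the draft: hugepage_global_enabled() only. */
static bool gate_draft(const struct khugepaged_inputs *in)
{
	return in->global_enabled;
}
```

With global_enabled false and shmem_pmd_enabled true, gate_current() returns
true but gate_draft() returns false, so in this model khugepaged would not be
started for shmem.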

> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>   mm/khugepaged.c | 36 ++++++------------------------------
>   1 file changed, 6 insertions(+), 30 deletions(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index b8452dbdb043..459c486a5a75 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -406,30 +406,6 @@ static inline int collapse_test_exit_or_disable(struct mm_struct *mm)
>   		mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm);
>   }
> 
> -static bool hugepage_pmd_enabled(void)
> -{
> -	/*
> -	 * We cover the anon, shmem and the file-backed case here; file-backed
> -	 * hugepages, when configured in, are determined by the global control.
> -	 * Anon pmd-sized hugepages are determined by the pmd-size control.
> -	 * Shmem pmd-sized hugepages are also determined by its pmd-size control,
> -	 * except when the global shmem_huge is set to SHMEM_HUGE_DENY.
> -	 */
> -	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
> -	    hugepage_global_enabled())
> -		return true;
> -	if (test_bit(PMD_ORDER, &huge_anon_orders_always))
> -		return true;
> -	if (test_bit(PMD_ORDER, &huge_anon_orders_madvise))
> -		return true;
> -	if (test_bit(PMD_ORDER, &huge_anon_orders_inherit) &&
> -	    hugepage_global_enabled())
> -		return true;
> -	if (IS_ENABLED(CONFIG_SHMEM) && shmem_hpage_pmd_enabled())
> -		return true;
> -	return false;
> -}
> -
>   void __khugepaged_enter(struct mm_struct *mm)
>   {
>   	struct mm_slot *slot;
> @@ -463,7 +439,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
>   			  vm_flags_t vm_flags)
>   {
>   	if (!mm_flags_test(MMF_VM_HUGEPAGE, vma->vm_mm) &&
> -	    hugepage_pmd_enabled()) {
> +	    hugepage_global_enabled()) {
>   		if (thp_vma_allowable_order(vma, vm_flags, TVA_KHUGEPAGED, PMD_ORDER))
>   			__khugepaged_enter(vma->vm_mm);
>   	}
> @@ -2599,7 +2575,7 @@ static void collapse_scan_mm_slot(unsigned int progress_max,
> 
>   static int khugepaged_has_work(void)
>   {
> -	return !list_empty(&khugepaged_scan.mm_head) && hugepage_pmd_enabled();
> +	return !list_empty(&khugepaged_scan.mm_head) && hugepage_global_enabled();
>   }
> 
>   static int khugepaged_wait_event(void)
> @@ -2672,7 +2648,7 @@ static void khugepaged_wait_work(void)
>   		return;
>   	}
> 
> -	if (hugepage_pmd_enabled())
> +	if (hugepage_global_enabled())
>   		wait_event_freezable(khugepaged_wait, khugepaged_wait_event());
>   }
> 
> @@ -2703,7 +2679,7 @@ void set_recommended_min_free_kbytes(void)
>   	int nr_zones = 0;
>   	unsigned long recommended_min;
> 
> -	if (!hugepage_pmd_enabled()) {
> +	if (!hugepage_global_enabled()) {
>   		calculate_min_free_kbytes();
>   		goto update_wmarks;
>   	}
> @@ -2753,7 +2729,7 @@ int start_stop_khugepaged(void)
>   	int err = 0;
> 
>   	mutex_lock(&khugepaged_mutex);
> -	if (hugepage_pmd_enabled()) {
> +	if (hugepage_global_enabled()) {
>   		if (!khugepaged_thread)
>   			khugepaged_thread = kthread_run(khugepaged, NULL,
>   							"khugepaged");
> @@ -2779,7 +2755,7 @@ int start_stop_khugepaged(void)
>   void khugepaged_min_free_kbytes_update(void)
>   {
>   	mutex_lock(&khugepaged_mutex);
> -	if (hugepage_pmd_enabled() && khugepaged_thread)
> +	if (hugepage_global_enabled() && khugepaged_thread)
>   		set_recommended_min_free_kbytes();
>   	mutex_unlock(&khugepaged_mutex);
>   }


