linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Usama Arif <usamaarif642@gmail.com>, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Zi Yan <ziy@nvidia.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>
Subject: Re: [PATCH v1] mm/huge_memory: fix shrinking of all-zero THPs with max_ptes_none default
Date: Fri, 5 Sep 2025 16:46:21 +0200
Message-ID: <1aa5818f-eb75-4aee-a866-9d2f81111056@redhat.com>
In-Reply-To: <06874db5-80f2-41a0-98f1-35177f758670@gmail.com>

[...]

> 
> The reason I did this is for the case where you change max_ptes_none after the THP is added
> to the deferred split list but *before* memory pressure, i.e. before the shrinker runs,
> so that it's considered for splitting.

Yeah, I was assuming that was the reason why the shrinker is enabled by 
default.

But in any sane system, the admin would enable the shrinker early. If 
not, we can look into handling it differently.

> 
>> Easy to reproduce:
>>
>> 1) Allocate some THPs filled with 0s
>>
>> <prog.c>
>>   #include <string.h>
>>   #include <stdio.h>
>>   #include <stdlib.h>
>>   #include <unistd.h>
>>   #include <sys/mman.h>
>>
>>   const size_t size = 1024*1024*1024;
>>
>>   int main(void)
>>   {
>>           size_t offs;
>>           char *area;
>>
>>           area = mmap(0, size, PROT_READ | PROT_WRITE,
>>                       MAP_ANON | MAP_PRIVATE, -1, 0);
>>           if (area == MAP_FAILED) {
>>                   printf("mmap failed\n");
>>                   exit(-1);
>>           }
>>           madvise(area, size, MADV_HUGEPAGE);
>>
>>           for (offs = 0; offs < size; offs += getpagesize())
>>                   area[offs] = 0;
>>           pause();
>>   }
>> </prog.c>
>>
>> 2) Trigger the shrinker
>>
>> E.g., memory pressure through memhog
>>
>> 3) Observe that THPs are not getting reclaimed
>>
>> $ cat /proc/`pgrep prog`/smaps_rollup
>>
>> Would list ~1GiB of AnonHugePages. With this fix, they would get
>> reclaimed as expected.
>>
>> Fixes: dafff3f4c850 ("mm: split underused THPs")
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
>> Cc: Nico Pache <npache@redhat.com>
>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>> Cc: Dev Jain <dev.jain@arm.com>
>> Cc: Barry Song <baohua@kernel.org>
>> Cc: Usama Arif <usamaarif642@gmail.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   mm/huge_memory.c | 3 ---
>>   1 file changed, 3 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 26cedfcd74189..aa3ed7a86435b 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -4110,9 +4110,6 @@ static bool thp_underused(struct folio *folio)
>>   	void *kaddr;
>>   	int i;
>>   
>> -	if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
>> -		return false;
>> -
> 
> I do agree with your use case, but I am really worried about the amount of
> work and CPU time the THP shrinker will consume when max_ptes_none is 511
> (I don't have any numbers to back up my worry :)), and it's less likely that
> we will have these completely zeroed out THPs (again no numbers to back up
> this statement).

Then the shrinker should be disabled by default if that becomes a 
problem.

Fortunately you documented the desired semantics:

"All THPs at fault and collapse time will be added to _deferred_list,
and will therefore be split under memory pressure if they are considered
"underused". A THP is underused if the number of zero-filled pages in
the THP is above max_ptes_none (see below)."
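
To make the documented rule concrete, here is a minimal userspace sketch of
the "underused" decision. It is not the kernel's thp_underused(); it only
models the sentence quoted above. The names thp_underused_sketch(),
page_is_zero(), PAGE_SIZE_SKETCH and PAGES_PER_THP are made up for
illustration, and 4 KiB base pages with 512 pages per THP are assumed.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096	/* assumption: 4 KiB base pages */
#define PAGES_PER_THP    512	/* assumption: 2 MiB PMD-sized THP */

/* Return true if every byte of one base page is zero. */
static bool page_is_zero(const unsigned char *page)
{
	for (size_t i = 0; i < PAGE_SIZE_SKETCH; i++)
		if (page[i])
			return false;
	return true;
}

/*
 * Userspace model of the documented rule: a THP is "underused" if the
 * number of zero-filled pages is above max_ptes_none.
 */
static bool thp_underused_sketch(const unsigned char *thp, int max_ptes_none)
{
	int zero = 0, filled = 0;

	for (int i = 0; i < PAGES_PER_THP; i++) {
		if (page_is_zero(thp + (size_t)i * PAGE_SIZE_SKETCH)) {
			/* Enough zero pages seen: the threshold is exceeded. */
			if (++zero > max_ptes_none)
				return true;
		} else {
			/* Too many filled pages: the threshold can no longer be exceeded. */
			if (++filled >= PAGES_PER_THP - max_ptes_none)
				return false;
		}
	}
	return false;
}
```

Under these assumptions, a completely zero-filled buffer with
max_ptes_none = 511 has 512 zero pages, which is above 511, so the sketch
reports it as underused. In this model, a single non-zero page with
max_ptes_none = 511 ends the scan as soon as that page is reached.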

> We have the huge_zero_folio as well which is installed on read.

Yes, only if the huge zero folio is not available. Which will then also 
get properly reclaimed.

-- 
Cheers

David / dhildenb



