From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
Barry Song <baohua@kernel.org>,
Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH v1] mm/huge_memory: fix shrinking of all-zero THPs with max_ptes_none default
Date: Fri, 5 Sep 2025 16:11:37 +0200
Message-ID: <20250905141137.3529867-1-david@redhat.com>
We added an early exit in thp_underused(), probably to avoid scanning
pages when there is no chance for success.
However, assume we have max_ptes_none = 511 (default).
Nothing should stop us from freeing all pages that are part of a THP
that is completely zero (512 zero-filled pages), and khugepaged will
certainly not try to instantiate a THP in that case (512 shared
zeropages).
This can just trivially happen if someone writes a single 0 byte into a
PMD area, or of course, when data ends up being zero later.
So let's remove that early exit.
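To illustrate the reasoning above, here is a minimal userspace sketch of
the check. It is not the kernel code. The names model_thp_underused(),
model_page_is_zero(), MODEL_PAGE_SIZE and MODEL_HPAGE_PMD_NR are made up
for this example, and the "zero pages > max_ptes_none" comparison is an
assumption about the part of thp_underused() that the diff context does
not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE    4096	/* assumption: 4 KiB base pages */
#define MODEL_HPAGE_PMD_NR 512	/* assumption: 2 MiB THP / 4 KiB pages */

/* Returns true if the page starting at p contains only zero bytes. */
static bool model_page_is_zero(const unsigned char *p)
{
	for (size_t i = 0; i < MODEL_PAGE_SIZE; i++)
		if (p[i])
			return false;
	return true;
}

/*
 * Hypothetical userspace model of thp_underused().
 * thp points to MODEL_HPAGE_PMD_NR contiguous pages.
 * early_exit selects the behavior before (true) or after (false) the fix.
 */
static bool model_thp_underused(const unsigned char *thp, int max_ptes_none,
				bool early_exit)
{
	int num_zero_pages = 0;

	assert(thp != NULL);

	/* The early exit that this patch removes. */
	if (early_exit && max_ptes_none == MODEL_HPAGE_PMD_NR - 1)
		return false;

	for (int i = 0; i < MODEL_HPAGE_PMD_NR; i++) {
		if (model_page_is_zero(thp + (size_t)i * MODEL_PAGE_SIZE)) {
			num_zero_pages++;
			/* Assumed threshold: more zero pages than max_ptes_none. */
			if (num_zero_pages > max_ptes_none)
				return true;
		}
	}
	return false;
}
```

With an all-zero 2 MiB buffer and max_ptes_none = 511, the model returns
false when early_exit is set and true when it is not, because 512 > 511.
A single non-zero page leaves at most 511 zero pages, so the model
returns false either way.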
Do we want to CC stable? Hm, not sure. Probably not urgent.
Note that, by default, the THP shrinker is active
(/sys/kernel/mm/transparent_hugepage/shrink_underused = 1), and all
THPs are added to the deferred split lists. However, with the
max_ptes_none default we would never scan them. If scanning them is
not desirable, we should instead disable the shrinker by default and
also stop adding all THPs to the deferred split lists.
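To check whether a system is in the configuration described above, a
small sketch like the following could be used. The helper names
read_long_from_file() and thp_never_scanned_before_fix() are
hypothetical. The shrink_underused path is the one quoted above; the
khugepaged/max_ptes_none path in the comment is not part of this
message and is mentioned only as the usual sysfs location. Passing 512
for hpage_pmd_nr assumes 2 MiB THPs on 4 KiB base pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Read one decimal integer from a file; returns -1 on any failure. */
static long read_long_from_file(const char *path)
{
	FILE *f;
	long val;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

/*
 * Hypothetical predicate: true if the shrinker is enabled but, before
 * this fix, the early exit would prevent every THP from being scanned.
 *
 * Typical inputs (paths are assumptions about the running system):
 *   /sys/kernel/mm/transparent_hugepage/shrink_underused
 *   /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none
 */
static bool thp_never_scanned_before_fix(long shrink_underused,
					 long max_ptes_none,
					 long hpage_pmd_nr)
{
	return shrink_underused == 1 && max_ptes_none == hpage_pmd_nr - 1;
}
```

Feeding the two sysfs values read via read_long_from_file() into
thp_never_scanned_before_fix() together with 512 reports whether the
defaults described above are in effect.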
Easy to reproduce:
1) Allocate some THPs filled with 0s
<prog.c>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

const size_t size = 1024*1024*1024;

int main(void)
{
	size_t offs;
	char *area;

	area = mmap(0, size, PROT_READ | PROT_WRITE,
		    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (area == MAP_FAILED) {
		printf("mmap failed\n");
		exit(-1);
	}
	madvise(area, size, MADV_HUGEPAGE);

	for (offs = 0; offs < size; offs += getpagesize())
		area[offs] = 0;
	pause();
}
</prog.c>
2) Trigger the shrinker
E.g., memory pressure through memhog
3) Observe that THPs are not getting reclaimed
$ cat /proc/`pgrep prog`/smaps_rollup
Without this fix, this lists ~1 GiB of AnonHugePages. With this fix,
they get reclaimed as expected.
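For scripting step 3, the AnonHugePages value can be extracted from the
smaps_rollup text with a small parser. This is only a sketch: the
function name anon_huge_pages_kb() is made up for this example, and it
assumes the usual "AnonHugePages:   <value> kB" line format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns the AnonHugePages value in kB from smaps_rollup-style text,
 * or -1 if no such line is found or it cannot be parsed.
 */
static long anon_huge_pages_kb(const char *text)
{
	static const char key[] = "AnonHugePages:";
	const char *line = text;
	long kb;

	assert(text != NULL);

	while (line && *line) {
		/* Match the key only at the start of a line. */
		if (strncmp(line, key, sizeof(key) - 1) == 0) {
			if (sscanf(line + sizeof(key) - 1, "%ld", &kb) == 1)
				return kb;
			return -1;
		}
		/* Advance to the character after the next newline. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A value of roughly 1048576 kB after step 2 would indicate that the THPs
were not reclaimed.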
Fixes: dafff3f4c850 ("mm: split underused THPs")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Usama Arif <usamaarif642@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/huge_memory.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 26cedfcd74189..aa3ed7a86435b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4110,9 +4110,6 @@ static bool thp_underused(struct folio *folio)
 	void *kaddr;
 	int i;
 
-	if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
-		return false;
-
 	for (i = 0; i < folio_nr_pages(folio); i++) {
 		kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
 		if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
--
2.50.1