From: Lance Yang <lance.yang@linux.dev>
To: hughd@google.com
Cc: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com,
baohua@kernel.org, david@kernel.org, dev.jain@arm.com,
lance.yang@linux.dev, liam.howlett@oracle.com, ljs@kernel.org,
mhocko@suse.com, rppt@kernel.org, npache@redhat.com,
zhengqi.arch@bytedance.com, ryan.roberts@arm.com,
surenb@google.com, ziy@nvidia.com, linux-mm@kvack.org
Subject: Re: [PATCH hotfix] mm: fix pmd_special() fallback to observe huge_zero
Date: Wed, 29 Apr 2026 13:54:47 +0800
Message-ID: <20260429055447.9220-1-lance.yang@linux.dev>
In-Reply-To: <74a75b59-2e13-3985-ee99-d5521f39df2a@google.com>

On Tue, Apr 28, 2026 at 10:08:37PM -0700, Hugh Dickins wrote:
>On x86 32-bit with THP enabled, zap_huge_pmd() is seen to generate a
>"WARNING: mm/memory.c:735 at __vm_normal_page+0x6a/0x7d", from the
>VM_WARN_ON_ONCE(is_zero_pfn(pfn) || is_huge_zero_pfn(pfn)); followed
>by "BUG: Bad rss-counter state"s, then later "BUG: Bad page state"s
>when reclaim gets to call shrink_huge_zero_folio_scan().
Good catch!
>It's as if the _PAGE_SPECIAL bit never got set in the huge_zero pmd:
>and indeed, whereas pte_special() and pte_mkspecial() are subject to a
>dedicated CONFIG_ARCH_HAS_PTE_SPECIAL, pmd_special() and pmd_mkspecial()
>are subject to CONFIG_ARCH_SUPPORTS_PMD_PFNMAP, which is never enabled
>on any 32-bit architecture.
>
>Add CONFIG_ARCH_HAS_PMD_SPECIAL? Perhaps; but I think it's better just
>to observe the huge_zero pmd in the fallback version of pmd_special().
>
>Fixes: d80a9cb1a64a ("mm/huge_memory: add and use normal_or_softleaf_folio_pmd()")
>Signed-off-by: Hugh Dickins <hughd@google.com>
>---
> include/linux/mm.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/include/linux/mm.h b/include/linux/mm.h
>index 0b776907152e..3b02ac43bcb7 100644
>--- a/include/linux/mm.h
>+++ b/include/linux/mm.h
>@@ -3422,7 +3422,7 @@ static inline pte_t pte_mkspecial(pte_t pte)
> #ifndef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
> static inline bool pmd_special(pmd_t pmd)
> {
>-	return false;
>+	return is_huge_zero_pmd(pmd);
> }
Hmm ... this feels a bit odd to me.

On these configs pmd_mkspecial() is still a no-op, so pmd_special()
would no longer mean "this PMD was made special" :)
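
To make the oddity concrete, here is a rough userspace sketch of the two
fallbacks side by side. It is only a toy model: model_pmd_t,
MODEL_HUGE_ZERO_PFN and every model_*() helper are made-up stand-ins for
illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in, not the kernel type: a PMD is reduced to the PFN it maps. */
typedef struct { unsigned long pfn; } model_pmd_t;

#define MODEL_HUGE_ZERO_PFN 0x1000UL	/* hypothetical huge zero folio PFN */

static bool model_is_huge_zero_pmd(model_pmd_t pmd)
{
	return pmd.pfn == MODEL_HUGE_ZERO_PFN;
}

/* Fallback without a PMD special bit: there is nowhere to record it. */
static model_pmd_t model_pmd_mkspecial(model_pmd_t pmd)
{
	return pmd;
}

/* Fallback pmd_special() as changed by the quoted hunk. */
static bool model_pmd_special(model_pmd_t pmd)
{
	return model_is_huge_zero_pmd(pmd);
}

static void model_check_asymmetry(void)
{
	model_pmd_t pfnmap = { .pfn = 0x2000UL };
	model_pmd_t zero = { .pfn = MODEL_HUGE_ZERO_PFN };

	/* Made special, yet not reported as special. */
	assert(!model_pmd_special(model_pmd_mkspecial(pfnmap)));
	/* Never made special, yet reported as special. */
	assert(model_pmd_special(zero));
}
```

In this model the getter and the setter no longer describe the same
property, which is the part that bothers me.
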
Could we handle the huge zero PMD in vm_normal_page_pmd() instead?
---8<---
diff --git a/mm/memory.c b/mm/memory.c
index 199214f8de08..3d9ed41a46b8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -793,6 +793,9 @@ struct folio *vm_normal_folio(struct vm_area_struct *vma, unsigned long addr,
 struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
 				pmd_t pmd)
 {
+	if (is_huge_zero_pmd(pmd))
+		return NULL;
+
 	return __vm_normal_page(vma, addr, pmd_pfn(pmd), pmd_special(pmd),
 				pmd_val(pmd), PGTABLE_LEVEL_PMD);
 }
--

zap_huge_pmd()
  -> normal_or_softleaf_folio_pmd()
     -> vm_normal_folio_pmd()
        -> vm_normal_page_pmd()
           -> is_huge_zero_pmd(pmd)
           -> return NULL
        -> return NULL

That's where we ask whether the PMD maps a normal page, and for the huge
zero PMD the answer is simply "no".

  -> has_deposited_pgtable()
     -> is_huge_zero_pmd(pmd)
     -> true
  -> zap_deposited_table()
  -> skip zap_huge_pmd_folio()

Then zap_huge_pmd() would still withdraw the deposited pgtable (except
for DAX huge zero PMDs), but skip zap_huge_pmd_folio() - no normal folio
rmap/RSS accounting.
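
The flow above can be sketched as a small userspace model. This is an
illustration under stated assumptions, not the kernel code: all model_*
names, the PFN constant and the result struct are hypothetical, softleaf
entries are left out, and pgtable handling is only modelled for the huge
zero case described here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, not kernel types: a PMD is reduced to the PFN it maps. */
typedef struct { unsigned long pfn; } model_pmd_t;
struct model_page { unsigned long pfn; };

#define MODEL_HUGE_ZERO_PFN 0x1000UL	/* hypothetical huge zero folio PFN */

static bool model_is_huge_zero_pmd(model_pmd_t pmd)
{
	return pmd.pfn == MODEL_HUGE_ZERO_PFN;
}

/* Models vm_normal_page_pmd() with the early return from the diff. */
static struct model_page *model_vm_normal_page_pmd(model_pmd_t pmd,
						   struct model_page *backing)
{
	if (model_is_huge_zero_pmd(pmd))
		return NULL;
	return backing;	/* stand-in for the __vm_normal_page() lookup */
}

struct model_zap_result {
	bool zero_pgtable_withdrawn;	/* huge zero PMD: deposited pgtable freed */
	bool folio_zapped;		/* normal folio rmap/RSS path taken */
};

static struct model_zap_result model_zap_huge_pmd(model_pmd_t pmd,
						  struct model_page *backing,
						  bool vma_is_dax)
{
	struct model_zap_result res = { false, false };

	if (model_vm_normal_page_pmd(pmd, backing))
		res.folio_zapped = true;
	else if (model_is_huge_zero_pmd(pmd))
		/* Withdrawn, except for DAX huge zero PMDs. */
		res.zero_pgtable_withdrawn = !vma_is_dax;

	return res;
}
```

For a huge zero PMD the model never reaches the folio path, whatever
pmd_special() returns, so no rss-counter or page-state update happens for
it.
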
Just a thought :)
Thanks,
Lance