From: Lorenzo Stoakes <ljs@kernel.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Liam R. Howlett" <liam@infradead.org>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Oscar Salvador <osalvador@suse.de>,
Hugh Dickins <hughd@google.com>,
Lance Yang <lance.yang@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Bibo Mao <maobibo@loongson.cn>,
stable@vger.kernel.org
Subject: Re: [PATCH] mm: fix __vm_normal_page() to handle missing support for pmd_special()/pud_special()
Date: Tue, 5 May 2026 13:20:40 +0100
Message-ID: <afnffbs-5cy7KGua@lucifer>
In-Reply-To: <20260430-pmd_special-v1-1-dbcbcfd72c20@kernel.org>
On Thu, Apr 30, 2026 at 01:31:22PM +0200, David Hildenbrand (Arm) wrote:
> On x86 32-bit with THP enabled, zap_huge_pmd() is seen to generate a
> "WARNING: mm/memory.c:735 at __vm_normal_page+0x6a/0x7d", from the
> VM_WARN_ON_ONCE(is_zero_pfn(pfn) || is_huge_zero_pfn(pfn)); followed
> by "BUG: Bad rss-counter state"s, then later "BUG: Bad page state"s
> when reclaim gets to call shrink_huge_zero_folio_scan().
>
> It's as if the _PAGE_SPECIAL bit never got set in the huge_zero pmd:
> and indeed, whereas pte_special() and pte_mkspecial() are subject to a
> dedicated CONFIG_ARCH_HAS_PTE_SPECIAL, pmd_special() and pmd_mkspecial()
> are subject to CONFIG_ARCH_SUPPORTS_PMD_PFNMAP, which is never enabled
> on any 32-bit architecture.
Ah damn. I wonder if it's really a combination of 'supports THP' and 'has a
spare software-defined bit free in the PTE'?

In any case, we obviously have to fix this.
>
> While the problem was exposed through commit d80a9cb1a64a ("mm/huge_memory:
> add and use normal_or_softleaf_folio_pmd()"), it was an oversight in commit
> af38538801c6 ("mm/memory: factor out common code from vm_normal_page_*()")
> and would result in other problems:
> * huge zero folio accounted in smaps, pagemap (PAGE_IS_FILE) and
> numamaps as file-backed THP
> * folio_walk_start() returning the folio even without FW_ZEROPAGE set.
> Callers seem to tolerate that, though.
>
> ... and triggering the VM_WARN_ON_ONCE(), although never reported so far.
>
> To fix it, teach vm_normal_page_pmd()/vm_normal_page_pud() to consider
> whether pmd_special/pud_special is actually implemented.
>
> Fixes: af38538801c6 ("mm/memory: factor out common code from vm_normal_page_*()")
> Reported-by: Hugh Dickins <hughd@google.com>
> Closes: https://lore.kernel.org/r/74a75b59-2e13-3985-ee99-d5521f39df2a@google.com
> Reported-by: Bibo Mao <maobibo@loongson.cn>
> Closes: https://lore.kernel.org/r/20260430041121.2839350-1-maobibo@loongson.cn
> Debugged-by: Hugh Dickins <hughd@google.com>
> Reviewed-by: Lance Yang <lance.yang@linux.dev>
> Tested-by: Bibo Mao <maobibo@loongson.cn>
> Cc: stable@vger.kernel.org
> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
This LGTM, so:
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
> ---
> This is an alternative to Hugh's patch, whereby we leave pmd_special()
> be a NOP and instead teach __vm_normal_page() about lack of support for
> pmd_special/pud_special.
> ---
> mm/memory.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 7322a40e73b9..4d84976fc7f4 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -612,6 +612,21 @@ static void print_bad_page_map(struct vm_area_struct *vma,
>  	dump_stack();
>  	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
>  }
> +
> +static inline bool pgtable_level_has_pxx_special(enum pgtable_level level)
> +{
> +	switch (level) {
> +	case PGTABLE_LEVEL_PTE:
> +		return IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL);
> +	case PGTABLE_LEVEL_PMD:
> +		return IS_ENABLED(CONFIG_ARCH_SUPPORTS_PMD_PFNMAP);
> +	case PGTABLE_LEVEL_PUD:
> +		return IS_ENABLED(CONFIG_ARCH_SUPPORTS_PUD_PFNMAP);
> +	default:
> +		return false;
> +	}
> +}
> +
>  #define print_bad_pte(vma, addr, pte, page) \
>  	print_bad_page_map(vma, addr, pte_val(pte), page, PGTABLE_LEVEL_PTE)
>
> @@ -684,7 +699,7 @@ static inline struct page *__vm_normal_page(struct vm_area_struct *vma,
>  		unsigned long addr, unsigned long pfn, bool special,
>  		unsigned long long entry, enum pgtable_level level)
>  {
> -	if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
> +	if (pgtable_level_has_pxx_special(level)) {
>  		if (unlikely(special)) {
>  #ifdef CONFIG_FIND_NORMAL_PAGE
>  			if (vma->vm_ops && vma->vm_ops->find_normal_page)
> @@ -699,8 +714,9 @@ static inline struct page *__vm_normal_page(struct vm_area_struct *vma,
>  			return NULL;
>  		}
>  		/*
> -		 * With CONFIG_ARCH_HAS_PTE_SPECIAL, any special page table
> -		 * mappings (incl. shared zero folios) are marked accordingly.
> +		 * With working pte_special()/pmd_special()..., any special page
> +		 * table mappings (incl. shared zero folios) are marked
> +		 * accordingly.
>  		 */
>  	} else {
>  		if (unlikely(vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))) {
>
> ---
>
> base-commit: d94322006a51b522dd361128a450bf9e75aad889
>
> change-id: 20260430-pmd_special-610dbdd8ac3c
>
> --
>
> Cheers,
>
> David
>
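
For illustration only, and not part of the patch: a minimal userspace sketch
of what the per-level check changes, assuming a configuration like the 32-bit
x86 + THP case described above (PTE special available, no PMD/PUD PFNMAP
support). The struct model_config, the old_/new_uses_special_path() helpers
and model_example() are hypothetical names made up for this sketch; only the
PGTABLE_LEVEL_* enumerators and the three Kconfig options they stand for come
from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the enumerators used in the quoted diff; everything else below
 * is a hypothetical userspace model, not kernel code. */
enum pgtable_level {
	PGTABLE_LEVEL_PTE,
	PGTABLE_LEVEL_PMD,
	PGTABLE_LEVEL_PUD,
};

/* Stand-in for the three Kconfig options the patch tests via IS_ENABLED(). */
struct model_config {
	bool arch_has_pte_special;
	bool arch_supports_pmd_pfnmap;
	bool arch_supports_pud_pfnmap;
};

/* Before the patch: the PTE-level option picked the path for every level. */
static bool old_uses_special_path(const struct model_config *cfg,
				  enum pgtable_level level)
{
	(void)level;
	return cfg->arch_has_pte_special;
}

/* After the patch: each level consults the option gating its own helper. */
static bool new_uses_special_path(const struct model_config *cfg,
				  enum pgtable_level level)
{
	switch (level) {
	case PGTABLE_LEVEL_PTE:
		return cfg->arch_has_pte_special;
	case PGTABLE_LEVEL_PMD:
		return cfg->arch_supports_pmd_pfnmap;
	case PGTABLE_LEVEL_PUD:
		return cfg->arch_supports_pud_pfnmap;
	default:
		return false;
	}
}

/* Example: PTE special is available, PMD/PUD PFNMAP support is not. */
static void model_example(void)
{
	const struct model_config cfg32 = { true, false, false };

	/* Old logic relies on a PMD special bit that is never set. */
	assert(old_uses_special_path(&cfg32, PGTABLE_LEVEL_PMD));
	/* New logic takes the VM_PFNMAP/VM_MIXEDMAP branch for PMDs instead. */
	assert(!new_uses_special_path(&cfg32, PGTABLE_LEVEL_PMD));
	/* PTE-level behaviour is unchanged. */
	assert(new_uses_special_path(&cfg32, PGTABLE_LEVEL_PTE));
}
```

Calling model_example() exercises the one case where the old and new checks
disagree; on a configuration with all three options enabled the two helpers
return the same answer at every level.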