linux-mm.kvack.org archive mirror
* [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
@ 2025-10-31 16:57 Catalin Marinas
  2025-10-31 17:15 ` David Hildenbrand
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-10-31 16:57 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-arm-kernel, David Hildenbrand, Andrew Morton, Will Deacon

On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
user space will need to have its allocation tags initialised. This is
normally done in the arm64 set_pte_at() after checking the memory
attributes. Such a page is also marked with the PG_mte_tagged flag to
avoid subsequent clearing. Since this relies on having a struct page,
pte_special() mappings are ignored.

Commit d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero
folio special") maps the huge zero folio as special, so the arm64
set_pmd_at() will no longer zero the tags. There is no guarantee that
the tags are zero, especially if parts of this huge page have been
previously tagged.

Allocate the huge zero folio with the __GFP_ZEROTAGS flag. In addition,
do not warn in the arm64 __access_remote_tags() when reading tags from
the huge zero page.

Fixes: d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero folio special")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <will@kernel.org>
---

It's fairly easy to detect this by regularly dropping the caches to
force the reallocation of the huge zero folio. I bundled the arm64
change in here as well since both changes relate to the commit that
maps the huge zero folio as special.
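
A minimal userspace sketch of the cache-dropping step mentioned above,
assuming the standard /proc/sys/vm/drop_caches sysctl is what is meant.
The helper name request_cache_drop() is hypothetical, and the email does
not say which value was written; 2 or 3 seems likely, since those also
ask the kernel to run its shrinkers, but that is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write a drop_caches mode to the given sysctl file.
 * Modes accepted here: 1 = page cache, 2 = reclaimable slab objects,
 * 3 = both. Returns 0 on success, -1 on failure. The path is a parameter
 * so the helper can be tried on a scratch file instead of the real sysctl.
 */
int request_cache_drop(const char *path, int mode)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	if (mode < 1 || mode > 3)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	/* The sysctl expects the mode as a decimal number. */
	ok = fprintf(f, "%d\n", mode) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

On a test machine, calling request_cache_drop("/proc/sys/vm/drop_caches", 3)
as root in a loop, with a short sleep between calls, would approximate
"regularly dropping the caches". Whether the huge zero folio is actually
freed and reallocated depends on nothing else holding a reference to it.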

I don't have a preference for how this patch goes in; either the mm
tree or the arm64 fixes tree is fine by me.

 arch/arm64/kernel/mte.c | 3 ++-
 mm/huge_memory.c        | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 43f7a2f39403..32148bf09c1d 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -476,7 +476,8 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
 
 		folio = page_folio(page);
 		if (folio_test_hugetlb(folio))
-			WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio));
+			WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio) &&
+				     !is_huge_zero_folio(folio));
 		else
 			WARN_ON_ONCE(!page_mte_tagged(page) && !is_zero_page(page));
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1b81680b4225..b7498e51282a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -214,7 +214,8 @@ static bool get_huge_zero_folio(void)
 	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
 		return true;
 
-	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
+	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
+				 ~__GFP_MOVABLE,
 			HPAGE_PMD_ORDER);
 	if (!zero_folio) {
 		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);



* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
  2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
@ 2025-10-31 17:15 ` David Hildenbrand
  2025-11-03 13:32 ` Mark Brown
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand @ 2025-10-31 17:15 UTC (permalink / raw)
  To: Catalin Marinas, linux-mm; +Cc: linux-arm-kernel, Andrew Morton, Will Deacon

On 31.10.25 17:57, Catalin Marinas wrote:
> On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
> user space will need to have its allocation tags initialised. This is
> normally done in the arm64 set_pte_at() after checking the memory
> attributes. Such a page is also marked with the PG_mte_tagged flag to
> avoid subsequent clearing. Since this relies on having a struct page,
> pte_special() mappings are ignored.
> 
> Commit d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero
> folio special") maps the huge zero folio as special, so the arm64
> set_pmd_at() will no longer zero the tags. There is no guarantee that
> the tags are zero, especially if parts of this huge page have been
> previously tagged.
> 
> Allocate the huge zero folio with the __GFP_ZEROTAGS flag. In addition,
> do not warn in the arm64 __access_remote_tags() when reading tags from
> the huge zero page.
> 
> Fixes: d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero folio special")
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Will Deacon <will@kernel.org>

Acked-by: David Hildenbrand <david@redhat.com>

Thanks, Catalin!

-- 
Cheers

David / dhildenb




* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
  2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
  2025-10-31 17:15 ` David Hildenbrand
@ 2025-11-03 13:32 ` Mark Brown
  2025-11-03 14:30   ` Catalin Marinas
  2025-11-08 19:19 ` [PATCH] mm/huge_memory: initialise the tags of the huge zero folio Jan Polensky
  2025-11-09  0:36 ` [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures Jan Polensky
  3 siblings, 1 reply; 21+ messages in thread
From: Mark Brown @ 2025-11-03 13:32 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-mm, linux-arm-kernel, David Hildenbrand, Andrew Morton,
	Will Deacon, Aishwarya.TCV


On Fri, Oct 31, 2025 at 04:57:50PM +0000, Catalin Marinas wrote:

> On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
> user space will need to have its allocation tags initialised. This is
> normally done in the arm64 set_pte_at() after checking the memory
> attributes. Such a page is also marked with the PG_mte_tagged flag to
> avoid subsequent clearing. Since this relies on having a struct page,
> pte_special() mappings are ignored.

We are seeing breakage in userspace on a range of arm64 platforms which
bisects to this commit in -next.  We see traces like:

[   59.746701] Internal error: Oops - Undefined instruction: 0000000002000000 [#1]  SMP

...

[   59.819007] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   59.826055] pc : mte_zero_clear_page_tags+0x1c/0x40
[   59.830980] lr : tag_clear_highpage+0x68/0x118

...

[   59.911874] Call trace:
[   59.914333]  mte_zero_clear_page_tags+0x1c/0x40 (P)
[   59.919278]  get_page_from_freelist+0x1a60/0x1c80
[   59.924042]  __alloc_frozen_pages_noprof+0x178/0xd20
[   59.929068]  alloc_pages_mpol+0xb4/0x1a4
[   59.933022]  alloc_frozen_pages_noprof+0x48/0xc0
[   59.937683]  folio_alloc_noprof+0x14/0x68
[   59.941718]  mm_get_huge_zero_folio+0xf4/0x30c
[   59.946200]  do_huge_pmd_anonymous_page+0x278/0x6a0
[   59.951119]  __handle_mm_fault+0x700/0x1834
[   59.955332]  handle_mm_fault+0x8c/0x2a0
[   59.959190]  do_page_fault+0x108/0x75c
[   59.962964]  do_translation_fault+0x5c/0x6c
[   59.967181]  do_mem_abort+0x40/0x90

Looking at the code:

> -       zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
> +       zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> +                                ~__GFP_MOVABLE,
>                         HPAGE_PMD_ORDER);

This adds an unconditional __GFP_ZEROTAGS. From a quick scan it looks
like this was previously only enabled by vma_alloc_zeroed_movable_folio()
when the VMA has VM_MTE, so I think we need a similar test here.

Full log:

   https://lava.sirena.org.uk/scheduler/job/2036941#L1423

Sample bisect log (with links to further runtime logs):

# bad: [3575af345aa2424636e69ac101a568bda249abe6] Merge branch 'i2c/i2c-host-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux.git
# good: [6146a0f1dfae5d37442a9ddcba012add260bceb0] Linux 6.18-rc4
git bisect start '3575af345aa2424636e69ac101a568bda249abe6' '6146a0f1dfae5d37442a9ddcba012add260bceb0'
# test job: [3575af345aa2424636e69ac101a568bda249abe6] https://lava.sirena.org.uk/scheduler/job/2036941
# bad: [3575af345aa2424636e69ac101a568bda249abe6] Merge branch 'i2c/i2c-host-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux.git
git bisect bad 3575af345aa2424636e69ac101a568bda249abe6
# test job: [81f07f151b0759666fee3184651aab90ef1c5ed5] https://lava.sirena.org.uk/scheduler/job/2037418
# bad: [81f07f151b0759666fee3184651aab90ef1c5ed5] Merge branch 'usb-linus' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git
git bisect bad 81f07f151b0759666fee3184651aab90ef1c5ed5
# test job: [661821dc41b20c505d5adad10bb71fe5746c57fb] https://lava.sirena.org.uk/scheduler/job/2037588
# bad: [661821dc41b20c505d5adad10bb71fe5746c57fb] Merge branch 'fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl.git
git bisect bad 661821dc41b20c505d5adad10bb71fe5746c57fb
# test job: [972cbfc499182c200bd3d2fb8fe4173df0d4d3e7] https://lava.sirena.org.uk/scheduler/job/2037773
# bad: [972cbfc499182c200bd3d2fb8fe4173df0d4d3e7] mm/huge_memory: initialise the tags of the huge zero folio
git bisect bad 972cbfc499182c200bd3d2fb8fe4173df0d4d3e7
# test job: [6923ff319e1b0d3541cd240b6ed494e1cb6287d1] https://lava.sirena.org.uk/scheduler/job/2037885
# good: [6923ff319e1b0d3541cd240b6ed494e1cb6287d1] fs/proc: fix uaf in proc_readdir_de()
git bisect good 6923ff319e1b0d3541cd240b6ed494e1cb6287d1
# test job: [e356021a7589f6dd1c946b86c933462dca0bc1ec] https://lava.sirena.org.uk/scheduler/job/2037973
# good: [e356021a7589f6dd1c946b86c933462dca0bc1ec] codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
git bisect good e356021a7589f6dd1c946b86c933462dca0bc1ec
# test job: [decb83d6239894a14b07c8f5ebf9feb5c87ec651] https://lava.sirena.org.uk/scheduler/job/2038008
# good: [decb83d6239894a14b07c8f5ebf9feb5c87ec651] mm/damon/sysfs: change next_update_jiffies to a global variable
git bisect good decb83d6239894a14b07c8f5ebf9feb5c87ec651
# test job: [53cf2716958490997b8ce11cc4665256905b23dd] https://lava.sirena.org.uk/scheduler/job/2038076
# good: [53cf2716958490997b8ce11cc4665256905b23dd] nilfs2: avoid having an active sc_timer before freeing sci
git bisect good 53cf2716958490997b8ce11cc4665256905b23dd
# first bad commit: [972cbfc499182c200bd3d2fb8fe4173df0d4d3e7] mm/huge_memory: initialise the tags of the huge zero folio



* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
  2025-11-03 13:32 ` Mark Brown
@ 2025-11-03 14:30   ` Catalin Marinas
  2025-11-03 14:41     ` David Hildenbrand (Red Hat)
  2025-11-04 11:53     ` [PATCH] mm/huge_memory: Initialise the tags of the huge zero Lance Yang
  0 siblings, 2 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-11-03 14:30 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-mm, linux-arm-kernel, David Hildenbrand, Andrew Morton,
	Will Deacon, Aishwarya.TCV

On Mon, Nov 03, 2025 at 01:32:42PM +0000, Mark Brown wrote:
> On Fri, Oct 31, 2025 at 04:57:50PM +0000, Catalin Marinas wrote:
> 
> > On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
> > user space will need to have its allocation tags initialised. This is
> > normally done in the arm64 set_pte_at() after checking the memory
> > attributes. Such a page is also marked with the PG_mte_tagged flag to
> > avoid subsequent clearing. Since this relies on having a struct page,
> > pte_special() mappings are ignored.
> 
> We are seeing breakage in userspace on a range of arm64 platforms which
> bisects to this commit in -next.  We see traces like:
> 
> [   59.746701] Internal error: Oops - Undefined instruction: 0000000002000000 [#1]  SMP
> 
> ...
> 
> [   59.819007] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   59.826055] pc : mte_zero_clear_page_tags+0x1c/0x40
> [   59.830980] lr : tag_clear_highpage+0x68/0x118
> 
> ...
> 
> [   59.911874] Call trace:
> [   59.914333]  mte_zero_clear_page_tags+0x1c/0x40 (P)
> [   59.919278]  get_page_from_freelist+0x1a60/0x1c80
> [   59.924042]  __alloc_frozen_pages_noprof+0x178/0xd20
> [   59.929068]  alloc_pages_mpol+0xb4/0x1a4
> [   59.933022]  alloc_frozen_pages_noprof+0x48/0xc0
> [   59.937683]  folio_alloc_noprof+0x14/0x68
> [   59.941718]  mm_get_huge_zero_folio+0xf4/0x30c
> [   59.946200]  do_huge_pmd_anonymous_page+0x278/0x6a0
> [   59.951119]  __handle_mm_fault+0x700/0x1834
> [   59.955332]  handle_mm_fault+0x8c/0x2a0
> [   59.959190]  do_page_fault+0x108/0x75c
> [   59.962964]  do_translation_fault+0x5c/0x6c
> [   59.967181]  do_mem_abort+0x40/0x90

Thanks for the report. I missed the fact that the arch code in
mte_zero_clear_page_tags() issues MTE instructions irrespective of
whether the hardware supports them. We got away with this so far since
we check the VM_MTE flag and that's only set if the hardware supports
MTE.

> Looking at the code:
> 
> > -       zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
> > +       zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> > +                                ~__GFP_MOVABLE,
> >                         HPAGE_PMD_ORDER);
> 
> This adds an unconditional __GFP_ZEROTAGS. From a quick scan it looks
> like this was previously only enabled by vma_alloc_zeroed_movable_folio()
> when the VMA has VM_MTE, so I think we need a similar test here.

We can't do this for the huge zero page since this will be shared by
other vmas and not all would have VM_MTE set. I'll fix it in the arch
code:

-----------8<---------------------------------------------
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index d816ff44faff..125dfa6c613b 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
 
 void tag_clear_highpage(struct page *page)
 {
+	/*
+	 * Check if MTE is supported and fall back to clear_highpage().
+	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
+	 * post_alloc_hook() will invoke tag_clear_highpage().
+	 */
+	if (!system_supports_mte()) {
+		clear_highpage(page);
+		return;
+	}
+
 	/* Newly allocated page, shouldn't have been tagged yet */
 	WARN_ON_ONCE(!try_page_mte_tagging(page));
 	mte_zero_clear_page_tags(page_address(page));
------------------8<------------------------------------------
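
The control flow of the fix above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
kernel code: every model_* name is hypothetical, a tag is stored as one
byte per 16-byte granule for simplicity, and an assert() stands in for
the undefined-instruction Oops seen on hardware without MTE.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE   4096
#define MODEL_TAG_GRANULE 16	/* MTE keeps one allocation tag per 16 bytes */

/* Hypothetical stand-in for a page: data plus one tag byte per granule. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	unsigned char tags[MODEL_PAGE_SIZE / MODEL_TAG_GRANULE];
	bool tagged;		/* plays the role of PG_mte_tagged */
};

/* Stand-in for system_supports_mte(); set by the user of the model. */
static bool model_hw_has_mte;
/* Counts how often the "MTE instruction" path was taken. */
static unsigned int model_mte_insn_uses;

/* Models mte_zero_clear_page_tags(): only valid on MTE-capable hardware. */
static void model_zero_clear_page_tags(struct model_page *page)
{
	assert(model_hw_has_mte);	/* real hardware would Oops here */
	model_mte_insn_uses++;
	memset(page->data, 0, sizeof(page->data));
	memset(page->tags, 0, sizeof(page->tags));
}

/* Models tag_clear_highpage() with the fallback added by the patch. */
void model_tag_clear_highpage(struct model_page *page)
{
	if (!model_hw_has_mte) {
		/* Equivalent of clear_highpage(): zero the data only. */
		memset(page->data, 0, sizeof(page->data));
		return;
	}

	model_zero_clear_page_tags(page);
	page->tagged = true;	/* record that the tags are initialised */
}
```

With model_hw_has_mte set to false, the page data is zeroed and the tag
path is never entered, which is the behaviour the patch aims for when
__GFP_ZEROTAGS is passed on a system without MTE.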

Testing now.

-- 
Catalin



* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
  2025-11-03 14:30   ` Catalin Marinas
@ 2025-11-03 14:41     ` David Hildenbrand (Red Hat)
  2025-11-03 15:59       ` Catalin Marinas
  2025-11-04 11:53     ` [PATCH] mm/huge_memory: Initialise the tags of the huge zero Lance Yang
  1 sibling, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-03 14:41 UTC (permalink / raw)
  To: Catalin Marinas, Mark Brown
  Cc: linux-mm, linux-arm-kernel, Andrew Morton, Will Deacon,
	Aishwarya.TCV

> 
> -----------8<---------------------------------------------
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index d816ff44faff..125dfa6c613b 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>   
>   void tag_clear_highpage(struct page *page)
>   {
> +	/*
> +	 * Check if MTE is supported and fall back to clear_highpage().
> +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> +	 * post_alloc_hook() will invoke tag_clear_highpage().
> +	 */
> +	if (!system_supports_mte()) {
> +		clear_highpage(page);
> +		return;
> +	}
> +

LGTM!

-- 
Cheers

David




* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
  2025-11-03 14:41     ` David Hildenbrand (Red Hat)
@ 2025-11-03 15:59       ` Catalin Marinas
  2025-11-03 19:29         ` Beleswar Prasad Padhi
  2025-11-04  1:05         ` Andrew Morton
  0 siblings, 2 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-11-03 15:59 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), Andrew Morton
  Cc: Mark Brown, linux-mm, linux-arm-kernel, Will Deacon,
	Aishwarya.TCV

On Mon, Nov 03, 2025 at 03:41:03PM +0100, David Hildenbrand (Red Hat) wrote:
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index d816ff44faff..125dfa6c613b 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
> >   void tag_clear_highpage(struct page *page)
> >   {
> > +	/*
> > +	 * Check if MTE is supported and fall back to clear_highpage().
> > +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> > +	 * post_alloc_hook() will invoke tag_clear_highpage().
> > +	 */
> > +	if (!system_supports_mte()) {
> > +		clear_highpage(page);
> > +		return;
> > +	}
> 
> LGTM!

I tested it with and without MTE and it works fine.

Andrew, would you like a separate patch or are you ok with folding this
into the previous patch?

Thanks.

-- 
Catalin



* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
  2025-11-03 15:59       ` Catalin Marinas
@ 2025-11-03 19:29         ` Beleswar Prasad Padhi
  2025-11-04  1:05         ` Andrew Morton
  1 sibling, 0 replies; 21+ messages in thread
From: Beleswar Prasad Padhi @ 2025-11-03 19:29 UTC (permalink / raw)
  To: Catalin Marinas, David Hildenbrand (Red Hat), Andrew Morton
  Cc: Mark Brown, linux-mm, linux-arm-kernel, Will Deacon,
	Aishwarya.TCV


On 11/3/2025 9:29 PM, Catalin Marinas wrote:
> On Mon, Nov 03, 2025 at 03:41:03PM +0100, David Hildenbrand (Red Hat) wrote:
>>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>>> index d816ff44faff..125dfa6c613b 100644
>>> --- a/arch/arm64/mm/fault.c
>>> +++ b/arch/arm64/mm/fault.c
>>> @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>>    void tag_clear_highpage(struct page *page)
>>>    {
>>> +	/*
>>> +	 * Check if MTE is supported and fall back to clear_highpage().
>>> +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
>>> +	 * post_alloc_hook() will invoke tag_clear_highpage().
>>> +	 */
>>> +	if (!system_supports_mte()) {
>>> +		clear_highpage(page);
>>> +		return;
>>> +	}
>> LGTM!
> I tested it with and without MTE and it works fine.


I tested the above patch on an arm64-based TI J7200 EVM board and it
boots fine. Feel free to use my T/B:

Tested-by: Beleswar Padhi <b-padhi@ti.com>

Thanks,
Beleswar

>
> Andrew, would you like a separate patch or are you ok with folding this
> into the previous patch?
>
> Thanks.
>



* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
  2025-11-03 15:59       ` Catalin Marinas
  2025-11-03 19:29         ` Beleswar Prasad Padhi
@ 2025-11-04  1:05         ` Andrew Morton
  2025-11-04  8:52           ` Catalin Marinas
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2025-11-04  1:05 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: David Hildenbrand (Red Hat), Mark Brown, linux-mm,
	linux-arm-kernel, Will Deacon, Aishwarya.TCV

On Mon, 3 Nov 2025 15:59:39 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote:

> > > --- a/arch/arm64/mm/fault.c
> > > +++ b/arch/arm64/mm/fault.c
> > > @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
> > >   void tag_clear_highpage(struct page *page)
> > >   {
> > > +	/*
> > > +	 * Check if MTE is supported and fall back to clear_highpage().
> > > +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> > > +	 * post_alloc_hook() will invoke tag_clear_highpage().
> > > +	 */
> > > +	if (!system_supports_mte()) {
> > > +		clear_highpage(page);
> > > +		return;
> > > +	}
> > 
> > LGTM!
> 
> I tested it with and without MTE and it works fine.
> 
> Andrew, would you like a separate patch or are you ok with folding this
> into the previous patch?

I added it as a -fix patch, thanks.

And I added a Signed-off-by-you-by-me ;)



* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
  2025-11-04  1:05         ` Andrew Morton
@ 2025-11-04  8:52           ` Catalin Marinas
  0 siblings, 0 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-11-04  8:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand (Red Hat), Mark Brown, linux-mm,
	linux-arm-kernel, Will Deacon, Aishwarya.TCV

On Mon, Nov 03, 2025 at 05:05:47PM -0800, Andrew Morton wrote:
> On Mon, 3 Nov 2025 15:59:39 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > --- a/arch/arm64/mm/fault.c
> > > > +++ b/arch/arm64/mm/fault.c
> > > > @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
> > > >   void tag_clear_highpage(struct page *page)
> > > >   {
> > > > +	/*
> > > > +	 * Check if MTE is supported and fall back to clear_highpage().
> > > > +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> > > > +	 * post_alloc_hook() will invoke tag_clear_highpage().
> > > > +	 */
> > > > +	if (!system_supports_mte()) {
> > > > +		clear_highpage(page);
> > > > +		return;
> > > > +	}
> > > 
> > > LGTM!
> > 
> > I tested it with and without MTE and it works fine.
> > 
> > Andrew, would you like a separate patch or are you ok with folding this
> > into the previous patch?
> 
> I added it as a -fix patch, thanks.
> 
> And I added a Signed-off-by-you-by-me ;)

Thanks Andrew.

-- 
Catalin



* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero
  2025-11-03 14:30   ` Catalin Marinas
  2025-11-03 14:41     ` David Hildenbrand (Red Hat)
@ 2025-11-04 11:53     ` Lance Yang
  1 sibling, 0 replies; 21+ messages in thread
From: Lance Yang @ 2025-11-04 11:53 UTC (permalink / raw)
  To: catalin.marinas
  Cc: Aishwarya.TCV, akpm, broonie, david, linux-arm-kernel, linux-mm,
	will, Lance Yang

From: Lance Yang <lance.yang@linux.dev>


On Mon, 3 Nov 2025 14:30:12 +0000, Catalin Marinas wrote:
> On Mon, Nov 03, 2025 at 01:32:42PM +0000, Mark Brown wrote:
> > On Fri, Oct 31, 2025 at 04:57:50PM +0000, Catalin Marinas wrote:
> > 
> > > On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
> > > user space will need to have its allocation tags initialised. This is
> > > normally done in the arm64 set_pte_at() after checking the memory
> > > attributes. Such a page is also marked with the PG_mte_tagged flag to
> > > avoid subsequent clearing. Since this relies on having a struct page,
> > > pte_special() mappings are ignored.
> > 
> > We are seeing breakage in userspace on a range of arm64 platforms which
> > bisects to this commit in -next.  We see traces like:
> > 
> > [   59.746701] Internal error: Oops - Undefined instruction: 0000000002000000 [#1]  SMP
> > 
> > ...
> > 
> > [   59.819007] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [   59.826055] pc : mte_zero_clear_page_tags+0x1c/0x40
> > [   59.830980] lr : tag_clear_highpage+0x68/0x118
> > 
> > ...
> > 
> > [   59.911874] Call trace:
> > [   59.914333]  mte_zero_clear_page_tags+0x1c/0x40 (P)
> > [   59.919278]  get_page_from_freelist+0x1a60/0x1c80
> > [   59.924042]  __alloc_frozen_pages_noprof+0x178/0xd20
> > [   59.929068]  alloc_pages_mpol+0xb4/0x1a4
> > [   59.933022]  alloc_frozen_pages_noprof+0x48/0xc0
> > [   59.937683]  folio_alloc_noprof+0x14/0x68
> > [   59.941718]  mm_get_huge_zero_folio+0xf4/0x30c
> > [   59.946200]  do_huge_pmd_anonymous_page+0x278/0x6a0
> > [   59.951119]  __handle_mm_fault+0x700/0x1834
> > [   59.955332]  handle_mm_fault+0x8c/0x2a0
> > [   59.959190]  do_page_fault+0x108/0x75c
> > [   59.962964]  do_translation_fault+0x5c/0x6c
> > [   59.967181]  do_mem_abort+0x40/0x90
> 
> Thanks for the report. I missed the fact that the arch code in
> mte_zero_clear_page_tags() issues MTE instructions irrespective of
> whether the hardware supports them. We got away with this so far since
> we check the VM_MTE flag and that's only set if the hardware supports
> MTE.
> 
> > Looking at the code:
> > 
> > > -       zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
> > > +       zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> > > +                                ~__GFP_MOVABLE,
> > >                         HPAGE_PMD_ORDER);
> > 
> > This adds an unconditional __GFP_ZEROTAGS. From a quick scan it looks
> > like this was previously only enabled by vma_alloc_zeroed_movable_folio()
> > when the VMA has VM_MTE, so I think we need a similar test here.
> 
> We can't do this for the huge zero page since this will be shared by
> other vmas and not all would have VM_MTE set. I'll fix it in the arch
> code:
> 
> -----------8<---------------------------------------------
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index d816ff44faff..125dfa6c613b 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>  
>  void tag_clear_highpage(struct page *page)
>  {
> +	/*
> +	 * Check if MTE is supported and fall back to clear_highpage().
> +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> +	 * post_alloc_hook() will invoke tag_clear_highpage().
> +	 */
> +	if (!system_supports_mte()) {
> +		clear_highpage(page);
> +		return;
> +	}
> +
>  	/* Newly allocated page, shouldn't have been tagged yet */
>  	WARN_ON_ONCE(!try_page_mte_tagging(page));
>  	mte_zero_clear_page_tags(page_address(page));
> ------------------8<------------------------------------------
> 
> Testing now.
> 

Good catch! LGTM, feel free to add:

Reviewed-by: Lance Yang <lance.yang@linux.dev>




* [PATCH] mm/huge_memory: initialise the tags of the huge zero folio
  2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
  2025-10-31 17:15 ` David Hildenbrand
  2025-11-03 13:32 ` Mark Brown
@ 2025-11-08 19:19 ` Jan Polensky
  2025-11-09  0:42   ` [PATCH] Clarification: please ignore earlier submission Jan Polensky
  2025-11-09  0:36 ` [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures Jan Polensky
  3 siblings, 1 reply; 21+ messages in thread
From: Jan Polensky @ 2025-11-08 19:19 UTC (permalink / raw)
  To: catalin.marinas; +Cc: akpm, david, linux-arm-kernel, linux-mm, will

From: Catalin Marinas <catalin.marinas@arm.com>

On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
user space will need to have its allocation tags initialised.  This is
normally done in the arm64 set_pte_at() after checking the memory
attributes.  Such a page is also marked with the PG_mte_tagged flag to avoid
subsequent clearing.  Since this relies on having a struct page,
pte_special() mappings are ignored.

Commit d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero
folio special") maps the huge zero folio as special, so the arm64
set_pmd_at() will no longer zero the tags.  There is no guarantee that the
tags are zero, especially if parts of this huge page have been previously
tagged.

It's fairly easy to detect this by regularly dropping the caches to
force the reallocation of the huge zero folio.

Allocate the huge zero folio with the __GFP_ZEROTAGS flag.  In addition,
do not warn in the arm64 __access_remote_tags() when reading tags from the
huge zero page.

I bundled the arm64 change in here as well since both changes relate to
the commit that maps the huge zero folio as special.

Link: https://lkml.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
Fixes: d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero folio special")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jan Polensky <japo@linux.ibm.com>
---
 arch/arm64/kernel/mte.c | 3 ++-
 mm/huge_memory.c        | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 43f7a2f39403..32148bf09c1d 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -476,7 +476,8 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
 
 		folio = page_folio(page);
 		if (folio_test_hugetlb(folio))
-			WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio));
+			WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio) &&
+				     !is_huge_zero_folio(folio));
 		else
 			WARN_ON_ONCE(!page_mte_tagged(page) && !is_zero_page(page));
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b4ff49d96501..323654fb4f8c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -214,7 +214,8 @@ static bool get_huge_zero_folio(void)
 	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
 		return true;
 
-	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
+	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
+				 ~__GFP_MOVABLE,
 			HPAGE_PMD_ORDER);
 	if (!zero_folio) {
 		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
-- 
2.48.1




* [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
                   ` (2 preceding siblings ...)
  2025-11-08 19:19 ` [PATCH] mm/huge_memory: initialise the tags of the huge zero folio Jan Polensky
@ 2025-11-09  0:36 ` Jan Polensky
  2025-11-10  9:09   ` David Hildenbrand (Red Hat)
  3 siblings, 1 reply; 21+ messages in thread
From: Jan Polensky @ 2025-11-09  0:36 UTC (permalink / raw)
  To: catalin.marinas; +Cc: akpm, david, linux-arm-kernel, linux-mm, will

The previous change added __GFP_ZEROTAGS when allocating the huge zero
folio to ensure tag initialization for arm64 with MTE enabled. However,
on s390 this flag is unnecessary and triggers a regression
(observed as a crash during repeated 'dnf makecache').

Restrict the use of __GFP_ZEROTAGS to architectures that support
hardware memory tagging (currently arm64 with MTE or KASAN HW tags).
This avoids unintended side effects on other platforms.

Fixes: 1579227fe0f0 ("mm/huge_memory: initialise the tags of the huge zero folio")
Link: https://lore.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
Signed-off-by: Jan Polensky <japo@linux.ibm.com>
---
 mm/huge_memory.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index aae283b00857..0c1794656d7a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -209,14 +209,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,

 static bool get_huge_zero_folio(void)
 {
+	gfp_t gfp = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE;
 	struct folio *zero_folio;
 retry:
 	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
 		return true;
-
-	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
-				 ~__GFP_MOVABLE,
-			HPAGE_PMD_ORDER);
+#if IS_ENABLED(CONFIG_KASAN_HW_TAGS) || IS_ENABLED(CONFIG_ARM64_MTE)
+	gfp |= __GFP_ZEROTAGS;
+#endif
+	zero_folio = folio_alloc(gfp, HPAGE_PMD_ORDER);
 	if (!zero_folio) {
 		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
 		return false;
--
2.48.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH] Clarification: please ignore earlier submission
  2025-11-08 19:19 ` [PATCH] mm/huge_memory: initialise the tags of the huge zero folio Jan Polensky
@ 2025-11-09  0:42   ` Jan Polensky
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Polensky @ 2025-11-09  0:42 UTC (permalink / raw)
  To: japo; +Cc: akpm, catalin.marinas, david, linux-arm-kernel, linux-mm, will

Hi all,

Apologies for the confusion, the patch titled
"[PATCH] mm/huge_memory: initialise the tags of the huge zero folio"
was accidentally sent.

The correct version is:
https://lore.kernel.org/all/20251109003613.1461433-1-japo@linux.ibm.com/

Thanks,
Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-11-09  0:36 ` [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures Jan Polensky
@ 2025-11-10  9:09   ` David Hildenbrand (Red Hat)
  2025-11-10  9:48     ` Jan Polensky
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-10  9:09 UTC (permalink / raw)
  To: Jan Polensky, catalin.marinas; +Cc: akpm, linux-arm-kernel, linux-mm, will

On 09.11.25 01:36, Jan Polensky wrote:
> The previous change added __GFP_ZEROTAGS when allocating the huge zero
> folio to ensure tag initialization for arm64 with MTE enabled. However,
> on s390 this flag is unnecessary and triggers a regression
> (observed as a crash during repeated 'dnf makecache').
> 
> Restrict the use of __GFP_ZEROTAGS to architectures that support
> hardware memory tagging (currently arm64 with MTE or KASAN HW tags).
> This avoids unintended side effects on other platforms.
> 
> Fixes: 1579227fe0f0 ("mm/huge_memory: initialise the tags of the huge zero folio")
> Link: https://lore.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
> Signed-off-by: Jan Polensky <japo@linux.ibm.com>
> ---
>   mm/huge_memory.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index aae283b00857..0c1794656d7a 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -209,14 +209,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> 
>   static bool get_huge_zero_folio(void)
>   {
> +	gfp_t gfp = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE;
>   	struct folio *zero_folio;
>   retry:
>   	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
>   		return true;
> -
> -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> -				 ~__GFP_MOVABLE,
> -			HPAGE_PMD_ORDER);
> +#if IS_ENABLED(CONFIG_KASAN_HW_TAGS) || IS_ENABLED(CONFIG_ARM64_MTE)
> +	gfp |= __GFP_ZEROTAGS;
> +#endif

That looks like the wrong approach. If an architecture does not support
__GFP_ZEROTAGS it should not trigger anything. __GFP_ZEROTAGS should be ignored.

I think the problem is that post_alloc_hook() does

if (zero_tags) {
	/* Initialize both memory and memory tags. */
	for (i = 0; i != 1 << order; ++i)
		tag_clear_highpage(page + i);

	/* Take note that memory was initialized by the loop above. */
	init = false;
}

And tag_clear_highpage() is a NOP on other architectures.
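
For illustration only, the failure mode described above can be modelled in
user space. This is a minimal sketch, not kernel code: every model_* name
and MODEL_PAGE_SIZE are hypothetical, a "page" is just a byte array, and
tags are not modelled at all. It only shows that when the hook sets
init = false after calling a no-op tag_clear_highpage(), the page keeps its
stale contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64	/* tiny stand-in for PAGE_SIZE */

/* Hypothetical model: a "page" is just a byte array. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Generic fallback as described in the thread: does nothing at all. */
static void model_tag_clear_highpage_nop(struct model_page *page)
{
	(void)page;
}

/* arm64-like variant: clears the data (tags are not modelled here). */
static void model_tag_clear_highpage_arm64(struct model_page *page)
{
	memset(page->data, 0, sizeof(page->data));
}

/*
 * Simplified model of the zero_tags branch quoted above. Returns true if
 * every byte of every page is zero afterwards.
 */
static bool model_post_alloc_hook(struct model_page *pages, size_t nr,
				  bool init, bool zero_tags,
				  void (*tag_clear)(struct model_page *))
{
	size_t i, j;

	assert(pages != NULL && tag_clear != NULL);

	if (zero_tags) {
		for (i = 0; i != nr; ++i)
			tag_clear(pages + i);
		/* The hook assumes the loop above initialised the memory. */
		init = false;
	}
	if (init)
		memset(pages, 0, nr * sizeof(*pages));

	for (i = 0; i != nr; ++i)
		for (j = 0; j != MODEL_PAGE_SIZE; ++j)
			if (pages[i].data[j])
				return false;
	return true;
}
```

With zero_tags set and the no-op variant passed in, the function reports
that the pages are not zeroed, which would match the symptom of a huge zero
folio that is not actually zero.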

Gah.

I wonder if the following would work:


diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 65db9349f9053..56b82e116cb79 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -47,7 +47,9 @@ enum {
         ___GFP_HARDWALL_BIT,
         ___GFP_THISNODE_BIT,
         ___GFP_ACCOUNT_BIT,
+#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
         ___GFP_ZEROTAGS_BIT,
+#endif
  #ifdef CONFIG_KASAN_HW_TAGS
         ___GFP_SKIP_ZERO_BIT,
         ___GFP_SKIP_KASAN_BIT,
@@ -85,7 +87,11 @@ enum {
  #define ___GFP_HARDWALL                BIT(___GFP_HARDWALL_BIT)
  #define ___GFP_THISNODE                BIT(___GFP_THISNODE_BIT)
  #define ___GFP_ACCOUNT         BIT(___GFP_ACCOUNT_BIT)
+#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
  #define ___GFP_ZEROTAGS                BIT(___GFP_ZEROTAGS_BIT)
+#else
+#define ___GFP_ZEROTAGS                0
+#endif
  #ifdef CONFIG_KASAN_HW_TAGS
  #define ___GFP_SKIP_ZERO       BIT(___GFP_SKIP_ZERO_BIT)
  #define ___GFP_SKIP_KASAN      BIT(___GFP_SKIP_KASAN_BIT)


Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
kconfig option.


Then we could turn the default implementation of
tag_clear_highpage() into a BUILD_BUG.
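
As a rough sketch of why defining the flag as 0 would be enough on
architectures without tag support: a caller can keep passing the flag
unconditionally, and the zero_tags test becomes constant false. The MODEL_*
and model_* names below are hypothetical stand-ins, not the kernel's
definitions, and the bit values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins, not the kernel's definitions. Build with
 * -DMODEL_HAVE_ARCH_TAG_CLEAR_HIGHPAGE to model an arch with tag support.
 */
#define MODEL_GFP_ZERO		(1u << 0)
#ifdef MODEL_HAVE_ARCH_TAG_CLEAR_HIGHPAGE
#define MODEL_GFP_ZEROTAGS	(1u << 1)
#else
#define MODEL_GFP_ZEROTAGS	0u
#endif

#ifndef MODEL_HAVE_ARCH_TAG_CLEAR_HIGHPAGE
/* With the flag compiled out, OR-ing it in must leave the mask unchanged. */
static_assert((MODEL_GFP_ZERO | MODEL_GFP_ZEROTAGS) == MODEL_GFP_ZERO,
	      "compiled-out flag must not set any bit");
#endif

/* Flags a caller such as get_huge_zero_folio() would pass unconditionally. */
static unsigned int model_huge_zero_gfp(void)
{
	return MODEL_GFP_ZERO | MODEL_GFP_ZEROTAGS;
}

/* Mirrors the shape of the zero_tags computation in post_alloc_hook(). */
static bool model_wants_zero_tags(unsigned int gfp, bool init)
{
	return init && (gfp & MODEL_GFP_ZEROTAGS);
}
```

Because the test is a compile-time constant in the tag-less build, the
compiler can drop the tag-clearing branch entirely, which is also what
would let the default tag_clear_highpage() become a build-time error.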



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-11-10  9:09   ` David Hildenbrand (Red Hat)
@ 2025-11-10  9:48     ` Jan Polensky
  2025-11-10  9:53       ` David Hildenbrand (Red Hat)
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Polensky @ 2025-11-10  9:48 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), catalin.marinas
  Cc: akpm, linux-arm-kernel, linux-mm, will

On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
> On 09.11.25 01:36, Jan Polensky wrote:
> > The previous change added __GFP_ZEROTAGS when allocating the huge zero
> > folio to ensure tag initialization for arm64 with MTE enabled. However,
> > on s390 this flag is unnecessary and triggers a regression
> > (observed as a crash during repeated 'dnf makecache').
> >
> > Restrict the use of __GFP_ZEROTAGS to architectures that support
> > hardware memory tagging (currently arm64 with MTE or KASAN HW tags).
> > This avoids unintended side effects on other platforms.
> >
> > Fixes: 1579227fe0f0 ("mm/huge_memory: initialise the tags of the huge zero folio")
> > Link: https://lore.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
> > Signed-off-by: Jan Polensky <japo@linux.ibm.com>
> > ---
> >   mm/huge_memory.c | 9 +++++----
> >   1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index aae283b00857..0c1794656d7a 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -209,14 +209,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> >
> >   static bool get_huge_zero_folio(void)
> >   {
> > +	gfp_t gfp = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE;
> >   	struct folio *zero_folio;
> >   retry:
> >   	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
> >   		return true;
> > -
> > -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> > -				 ~__GFP_MOVABLE,
> > -			HPAGE_PMD_ORDER);
> > +#if IS_ENABLED(CONFIG_KASAN_HW_TAGS) || IS_ENABLED(CONFIG_ARM64_MTE)
> > +	gfp |= __GFP_ZEROTAGS;
> > +#endif
>
> That looks like the wrong approach. If an architecture does not support
> __GFP_ZEROTAGS it should not trigger anything. __GFP_ZEROTAGS should be ignored.
>
> I think the problem is that post_alloc_hook() does
>
> if (zero_tags) {
> 	/* Initialize both memory and memory tags. */
> 	for (i = 0; i != 1 << order; ++i)
> 		tag_clear_highpage(page + i);
>
> 	/* Take note that memory was initialized by the loop above. */
> 	init = false;
> }
>
> And tag_clear_highpage() is a NOP on other architectures.
>
> Gah.
>
> I wonder if the following would work:
>
>
> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> index 65db9349f9053..56b82e116cb79 100644
> --- a/include/linux/gfp_types.h
> +++ b/include/linux/gfp_types.h
> @@ -47,7 +47,9 @@ enum {
>         ___GFP_HARDWALL_BIT,
>         ___GFP_THISNODE_BIT,
>         ___GFP_ACCOUNT_BIT,
> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>         ___GFP_ZEROTAGS_BIT,
> +#endif
>  #ifdef CONFIG_KASAN_HW_TAGS
>         ___GFP_SKIP_ZERO_BIT,
>         ___GFP_SKIP_KASAN_BIT,
> @@ -85,7 +87,11 @@ enum {
>  #define ___GFP_HARDWALL                BIT(___GFP_HARDWALL_BIT)
>  #define ___GFP_THISNODE                BIT(___GFP_THISNODE_BIT)
>  #define ___GFP_ACCOUNT         BIT(___GFP_ACCOUNT_BIT)
> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>  #define ___GFP_ZEROTAGS                BIT(___GFP_ZEROTAGS_BIT)
> +#else
> +#define ___GFP_ZEROTAGS                0
> +#endif
>  #ifdef CONFIG_KASAN_HW_TAGS
>  #define ___GFP_SKIP_ZERO       BIT(___GFP_SKIP_ZERO_BIT)
>  #define ___GFP_SKIP_KASAN      BIT(___GFP_SKIP_KASAN_BIT)
>
>
> Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
> kconfig option.
>
>
> Then we could turn the default implementation of
> tag_clear_highpage() into a BUILD_BUG.
>
I'd like to suggest to keep the enum untouched and only use the second
part of your suggestion.
Which works by the way for our arch (s390).

 include/linux/gfp_types.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 65db9349f905..c12d8a601bb3 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -85,7 +85,11 @@ enum {
 #define ___GFP_HARDWALL        BIT(___GFP_HARDWALL_BIT)
 #define ___GFP_THISNODE        BIT(___GFP_THISNODE_BIT)
 #define ___GFP_ACCOUNT     BIT(___GFP_ACCOUNT_BIT)
+#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
 #define ___GFP_ZEROTAGS        BIT(___GFP_ZEROTAGS_BIT)
+#else
+#define ___GFP_ZEROTAGS        0
+#endif
 #ifdef CONFIG_KASAN_HW_TAGS
 #define ___GFP_SKIP_ZERO   BIT(___GFP_SKIP_ZERO_BIT)
 #define ___GFP_SKIP_KASAN  BIT(___GFP_SKIP_KASAN_BIT)

This solution would be sufficient from my side, and I would appreciate a
quick application if there are no objections.

Thank you David.


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-11-10  9:48     ` Jan Polensky
@ 2025-11-10  9:53       ` David Hildenbrand (Red Hat)
  2025-11-10 15:28         ` Catalin Marinas
  2025-11-11 10:44         ` Jan Polensky
  0 siblings, 2 replies; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-10  9:53 UTC (permalink / raw)
  To: Jan Polensky, catalin.marinas; +Cc: akpm, linux-arm-kernel, linux-mm, will

On 10.11.25 10:48, Jan Polensky wrote:
> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>> On 09.11.25 01:36, Jan Polensky wrote:
>>> The previous change added __GFP_ZEROTAGS when allocating the huge zero
>>> folio to ensure tag initialization for arm64 with MTE enabled. However,
>>> on s390 this flag is unnecessary and triggers a regression
>>> (observed as a crash during repeated 'dnf makecache').
>>>
>>> Restrict the use of __GFP_ZEROTAGS to architectures that support
>>> hardware memory tagging (currently arm64 with MTE or KASAN HW tags).
>>> This avoids unintended side effects on other platforms.
>>>
>>> Fixes: 1579227fe0f0 ("mm/huge_memory: initialise the tags of the huge zero folio")
>>> Link: https://lore.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
>>> Signed-off-by: Jan Polensky <japo@linux.ibm.com>
>>> ---
>>>    mm/huge_memory.c | 9 +++++----
>>>    1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index aae283b00857..0c1794656d7a 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -209,14 +209,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>>
>>>    static bool get_huge_zero_folio(void)
>>>    {
>>> +	gfp_t gfp = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE;
>>>    	struct folio *zero_folio;
>>>    retry:
>>>    	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
>>>    		return true;
>>> -
>>> -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
>>> -				 ~__GFP_MOVABLE,
>>> -			HPAGE_PMD_ORDER);
>>> +#if IS_ENABLED(CONFIG_KASAN_HW_TAGS) || IS_ENABLED(CONFIG_ARM64_MTE)
>>> +	gfp |= __GFP_ZEROTAGS;
>>> +#endif
>>
>> That looks like the wrong approach. If an architecture does not support
>> __GFP_ZEROTAGS it should not trigger anything. __GFP_ZEROTAGS should be ignored.
>>
>> I think the problem is that post_alloc_hook() does
>>
>> if (zero_tags) {
>> 	/* Initialize both memory and memory tags. */
>> 	for (i = 0; i != 1 << order; ++i)
>> 		tag_clear_highpage(page + i);
>>
>> 	/* Take note that memory was initialized by the loop above. */
>> 	init = false;
>> }
>>
>> And tag_clear_highpage() is a NOP on other architectures.
>>
>> Gah.
>>
>> I wonder if the following would work:
>>
>>
>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>> index 65db9349f9053..56b82e116cb79 100644
>> --- a/include/linux/gfp_types.h
>> +++ b/include/linux/gfp_types.h
>> @@ -47,7 +47,9 @@ enum {
>>          ___GFP_HARDWALL_BIT,
>>          ___GFP_THISNODE_BIT,
>>          ___GFP_ACCOUNT_BIT,
>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>          ___GFP_ZEROTAGS_BIT,
>> +#endif
>>   #ifdef CONFIG_KASAN_HW_TAGS
>>          ___GFP_SKIP_ZERO_BIT,
>>          ___GFP_SKIP_KASAN_BIT,
>> @@ -85,7 +87,11 @@ enum {
>>   #define ___GFP_HARDWALL                BIT(___GFP_HARDWALL_BIT)
>>   #define ___GFP_THISNODE                BIT(___GFP_THISNODE_BIT)
>>   #define ___GFP_ACCOUNT         BIT(___GFP_ACCOUNT_BIT)
>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>   #define ___GFP_ZEROTAGS                BIT(___GFP_ZEROTAGS_BIT)
>> +#else
>> +#define ___GFP_ZEROTAGS                0
>> +#endif
>>   #ifdef CONFIG_KASAN_HW_TAGS
>>   #define ___GFP_SKIP_ZERO       BIT(___GFP_SKIP_ZERO_BIT)
>>   #define ___GFP_SKIP_KASAN      BIT(___GFP_SKIP_KASAN_BIT)
>>
>>
>> Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
>> kconfig option.
>>
>>
>> Then we could turn the default implementation of
>> tag_clear_highpage() into a BUILD_BUG.
>>
> I'd like to suggest to keep the enum untouched and only use the second
> part of your suggestion.

Why? We also do that for CONFIG_KASAN_HW_TAGS, CONFIG_LOCKDEP and 
CONFIG_SLAB_OBJ_EXT.

> Which works by the way for our arch (s390).
> 
>   include/linux/gfp_types.h | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> index 65db9349f905..c12d8a601bb3 100644
> --- a/include/linux/gfp_types.h
> +++ b/include/linux/gfp_types.h
> @@ -85,7 +85,11 @@ enum {
>   #define ___GFP_HARDWALL        BIT(___GFP_HARDWALL_BIT)
>   #define ___GFP_THISNODE        BIT(___GFP_THISNODE_BIT)
>   #define ___GFP_ACCOUNT     BIT(___GFP_ACCOUNT_BIT)
> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>   #define ___GFP_ZEROTAGS        BIT(___GFP_ZEROTAGS_BIT)
> +#else
> +#define ___GFP_ZEROTAGS        0
> +#endif
>   #ifdef CONFIG_KASAN_HW_TAGS
>   #define ___GFP_SKIP_ZERO   BIT(___GFP_SKIP_ZERO_BIT)
>   #define ___GFP_SKIP_KASAN  BIT(___GFP_SKIP_KASAN_BIT)
> 
> This solution would be sufficient from my side, and I would appreciate a
> quick application if there are no objections.

As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen 
early in that file, it should likely become a CONFIG_ thing.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-11-10  9:53       ` David Hildenbrand (Red Hat)
@ 2025-11-10 15:28         ` Catalin Marinas
  2025-11-10 15:55           ` Catalin Marinas
  2025-11-11 10:44         ` Jan Polensky
  1 sibling, 1 reply; 21+ messages in thread
From: Catalin Marinas @ 2025-11-10 15:28 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: Jan Polensky, akpm, linux-arm-kernel, linux-mm, will

On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
> On 10.11.25 10:48, Jan Polensky wrote:
> > On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
> > > On 09.11.25 01:36, Jan Polensky wrote:
> > > > The previous change added __GFP_ZEROTAGS when allocating the huge zero
> > > > folio to ensure tag initialization for arm64 with MTE enabled. However,
> > > > on s390 this flag is unnecessary and triggers a regression
> > > > (observed as a crash during repeated 'dnf makecache').
[...]
> > > I think the problem is that post_alloc_hook() does
> > > 
> > > if (zero_tags) {
> > > 	/* Initialize both memory and memory tags. */
> > > 	for (i = 0; i != 1 << order; ++i)
> > > 		tag_clear_highpage(page + i);
> > > 
> > > 	/* Take note that memory was initialized by the loop above. */
> > > 	init = false;
> > > }
> > > 
> > > And tag_clear_highpage() is a NOP on other architectures.

Hmm, another thing I missed. Sorry about this.

> > Which works by the way for our arch (s390).
> > 
> >   include/linux/gfp_types.h | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> > index 65db9349f905..c12d8a601bb3 100644
> > --- a/include/linux/gfp_types.h
> > +++ b/include/linux/gfp_types.h
> > @@ -85,7 +85,11 @@ enum {
> >   #define ___GFP_HARDWALL        BIT(___GFP_HARDWALL_BIT)
> >   #define ___GFP_THISNODE        BIT(___GFP_THISNODE_BIT)
> >   #define ___GFP_ACCOUNT     BIT(___GFP_ACCOUNT_BIT)
> > +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> >   #define ___GFP_ZEROTAGS        BIT(___GFP_ZEROTAGS_BIT)
> > +#else
> > +#define ___GFP_ZEROTAGS        0
> > +#endif
> >   #ifdef CONFIG_KASAN_HW_TAGS
> >   #define ___GFP_SKIP_ZERO   BIT(___GFP_SKIP_ZERO_BIT)
> >   #define ___GFP_SKIP_KASAN  BIT(___GFP_SKIP_KASAN_BIT)
> > 
> > This solution would be sufficient from my side, and I would appreciate a
> > quick application if there are no objections.
> 
> As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
> early in that file, it should likely become a CONFIG_ thing.

I'm fine with either option above but I'll throw one more in the mix:

--------------------8<--------------------------------
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 2312e6ee595f..dcff91533590 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -33,6 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
 						unsigned long vaddr);
 #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio
 
+bool arch_has_tag_clear_highpage(void);
 void tag_clear_highpage(struct page *to);
 #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
 
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 125dfa6c613b..318d091db843 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -967,18 +967,13 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
 	return vma_alloc_folio(flags, 0, vma, vaddr);
 }
 
+bool arch_has_tag_clear_highpage(void)
+{
+	return system_supports_mte();
+}
+
 void tag_clear_highpage(struct page *page)
 {
-	/*
-	 * Check if MTE is supported and fall back to clear_highpage().
-	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
-	 * post_alloc_hook() will invoke tag_clear_highpage().
-	 */
-	if (!system_supports_mte()) {
-		clear_highpage(page);
-		return;
-	}
-
 	/* Newly allocated page, shouldn't have been tagged yet */
 	WARN_ON_ONCE(!try_page_mte_tagging(page));
 	mte_zero_clear_page_tags(page_address(page));
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 105cc4c00cc3..7aa56179ccef 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -251,6 +251,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page)
 
 #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
 
+static inline bool arch_has_tag_clear_highpage(void)
+{
+	return false;
+}
+
 static inline void tag_clear_highpage(struct page *page)
 {
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e4efda1158b2..5ab15431bc06 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1798,7 +1798,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 {
 	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
 			!should_skip_init(gfp_flags);
-	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
+	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS) &&
+		arch_has_tag_clear_highpage();
 	int i;
 
 	set_page_private(page, 0);
--------------------8<--------------------------------

Reasoning: with MTE on arm64, you can't have kasan-tagged pages in the
kernel which are also exposed to user because the tags are shared (same
physical location). The 'zero_tags' initialisation in post_alloc_hook()
makes sense for this behaviour. With virtual tagging (briefly announced
in [1], full specs not public yet), both the user and the kernel can
have their own tags - more like KASAN_SW_TAGS but without the compiler
instrumentation. The kernel won't be able to zero the tags for the user
since they are in virtual space. It can, however, continue to use Kasan
tags even if the pages are mapped in user space. In this case, I'd
rather use the kernel_init_pages() call further down in
post_alloc_hook() than replicating it in tag_clear_highpage(). When we
get to upstreaming virtual tagging (informally vMTE, sometime next
year), I'd like to have a kernel image that supports both, so the
decision on whether to call tag_clear_highpage() will need to be
dynamic.
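
To make the "dynamic" decision concrete, here is a small user-space sketch
of the gating in the diff above. It is an illustration under assumptions:
model_mte_present is a hypothetical switch standing in for
system_supports_mte(), and MODEL_GFP_ZEROTAGS is an arbitrary bit value.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_GFP_ZEROTAGS (1u << 1)	/* hypothetical bit value */

/* Hypothetical runtime switch standing in for system_supports_mte(). */
static bool model_mte_present;

static bool model_arch_has_tag_clear_highpage(void)
{
	return model_mte_present;
}

struct model_init_plan {
	bool zero_tags;	/* call tag_clear_highpage() per page */
	bool init;	/* fall through to the generic zeroing path */
};

static struct model_init_plan model_plan_init(unsigned int gfp, bool init)
{
	struct model_init_plan plan;

	plan.zero_tags = init && (gfp & MODEL_GFP_ZEROTAGS) &&
			 model_arch_has_tag_clear_highpage();
	/* Only skip generic zeroing if the tag path really does the work. */
	plan.init = init && !plan.zero_tags;
	return plan;
}

/* Usage example: the same flags give different plans at run time. */
void model_plan_selfcheck(void)
{
	struct model_init_plan plan;

	model_mte_present = false;
	plan = model_plan_init(MODEL_GFP_ZEROTAGS, true);
	assert(!plan.zero_tags && plan.init);

	model_mte_present = true;
	plan = model_plan_init(MODEL_GFP_ZEROTAGS, true);
	assert(plan.zero_tags && !plan.init);
}
```

The point of the sketch is that one binary can take either path, which a
purely compile-time switch on the flag could not express.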

[1] https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/future-architecture-technologies-poe2-and-vmte

-- 
Catalin


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-11-10 15:28         ` Catalin Marinas
@ 2025-11-10 15:55           ` Catalin Marinas
  0 siblings, 0 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-11-10 15:55 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: Jan Polensky, akpm, linux-arm-kernel, linux-mm, will

On Mon, Nov 10, 2025 at 03:28:16PM +0000, Catalin Marinas wrote:
> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
> > On 10.11.25 10:48, Jan Polensky wrote:
> > > On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
> > > > On 09.11.25 01:36, Jan Polensky wrote:
> > > > > The previous change added __GFP_ZEROTAGS when allocating the huge zero
> > > > > folio to ensure tag initialization for arm64 with MTE enabled. However,
> > > > > on s390 this flag is unnecessary and triggers a regression
> > > > > (observed as a crash during repeated 'dnf makecache').
> [...]
> > > > I think the problem is that post_alloc_hook() does
> > > > 
> > > > if (zero_tags) {
> > > > 	/* Initialize both memory and memory tags. */
> > > > 	for (i = 0; i != 1 << order; ++i)
> > > > 		tag_clear_highpage(page + i);
> > > > 
> > > > 	/* Take note that memory was initialized by the loop above. */
> > > > 	init = false;
> > > > }
> > > > 
> > > > And tag_clear_highpage() is a NOP on other architectures.
[...]
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 2312e6ee595f..dcff91533590 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -33,6 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>  						unsigned long vaddr);
>  #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio
>  
> +bool arch_has_tag_clear_highpage(void);
>  void tag_clear_highpage(struct page *to);
>  #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>  
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 125dfa6c613b..318d091db843 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -967,18 +967,13 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>  	return vma_alloc_folio(flags, 0, vma, vaddr);
>  }
>  
> +bool arch_has_tag_clear_highpage(void)
> +{
> +	return system_supports_mte();
> +}
> +
>  void tag_clear_highpage(struct page *page)
>  {
> -	/*
> -	 * Check if MTE is supported and fall back to clear_highpage().
> -	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> -	 * post_alloc_hook() will invoke tag_clear_highpage().
> -	 */
> -	if (!system_supports_mte()) {
> -		clear_highpage(page);
> -		return;
> -	}
> -
>  	/* Newly allocated page, shouldn't have been tagged yet */
>  	WARN_ON_ONCE(!try_page_mte_tagging(page));
>  	mte_zero_clear_page_tags(page_address(page));
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 105cc4c00cc3..7aa56179ccef 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -251,6 +251,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page)
>  
>  #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>  
> +static inline bool arch_has_tag_clear_highpage(void)
> +{
> +	return false;
> +}
> +
>  static inline void tag_clear_highpage(struct page *page)
>  {
>  }
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e4efda1158b2..5ab15431bc06 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1798,7 +1798,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>  {
>  	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
>  			!should_skip_init(gfp_flags);
> -	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
> +	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS) &&
> +		arch_has_tag_clear_highpage();
>  	int i;
>  
>  	set_page_private(page, 0);
> --------------------8<--------------------------------
> 
> Reasoning: with MTE on arm64, you can't have kasan-tagged pages in the
> kernel which are also exposed to user because the tags are shared (same
> physical location). The 'zero_tags' initialisation in post_alloc_hook()
> makes sense for this behaviour. With virtual tagging (briefly announced
> in [1], full specs not public yet), both the user and the kernel can
> have their own tags - more like KASAN_SW_TAGS but without the compiler
> instrumentation. The kernel won't be able to zero the tags for the user
> since they are in virtual space. It can, however, continue to use Kasan
> tags even if the pages are mapped in user space. In this case, I'd
> rather use the kernel_init_pages() call further down in
> post_alloc_hook() than replicating it in tag_clear_highpage(). When we
> get to upstreaming virtual tagging (informally vMTE, sometime next
> year), I'd like to have a kernel image that supports both, so the
> decision on whether to call tag_clear_highpage() will need to be
> dynamic.

Actually, there's not much to kernel_init_pages() other than disabling
kasan temporarily since the unpoisoning already took place a few lines
up. The arm64 tag_clear_highpage() calling clear_highpage() directly is
fine before unpoisoning. So we can cope with this even in the vMTE case.

A simple patch hiding the enum is fine by me.

-- 
Catalin


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-11-10  9:53       ` David Hildenbrand (Red Hat)
  2025-11-10 15:28         ` Catalin Marinas
@ 2025-11-11 10:44         ` Jan Polensky
  2025-11-11 12:27           ` David Hildenbrand (Red Hat)
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Polensky @ 2025-11-11 10:44 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), catalin.marinas
  Cc: akpm, linux-arm-kernel, linux-mm, will

On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
> On 10.11.25 10:48, Jan Polensky wrote:
> > On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
> > > On 09.11.25 01:36, Jan Polensky wrote:
---8<--- snip ---8<---
> > > I wonder if the following would work:
> > >
> > >
> > > diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> > > index 65db9349f9053..56b82e116cb79 100644
> > > --- a/include/linux/gfp_types.h
> > > +++ b/include/linux/gfp_types.h
> > > @@ -47,7 +47,9 @@ enum {
> > >          ___GFP_HARDWALL_BIT,
> > >          ___GFP_THISNODE_BIT,
> > >          ___GFP_ACCOUNT_BIT,
> > > +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> > >          ___GFP_ZEROTAGS_BIT,
> > > +#endif
> > >   #ifdef CONFIG_KASAN_HW_TAGS
> > >          ___GFP_SKIP_ZERO_BIT,
> > >          ___GFP_SKIP_KASAN_BIT,
> > > @@ -85,7 +87,11 @@ enum {
> > >   #define ___GFP_HARDWALL                BIT(___GFP_HARDWALL_BIT)
> > >   #define ___GFP_THISNODE                BIT(___GFP_THISNODE_BIT)
> > >   #define ___GFP_ACCOUNT         BIT(___GFP_ACCOUNT_BIT)
> > > +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> > >   #define ___GFP_ZEROTAGS                BIT(___GFP_ZEROTAGS_BIT)
> > > +#else
> > > +#define ___GFP_ZEROTAGS                0
> > > +#endif
> > >   #ifdef CONFIG_KASAN_HW_TAGS
> > >   #define ___GFP_SKIP_ZERO       BIT(___GFP_SKIP_ZERO_BIT)
> > >   #define ___GFP_SKIP_KASAN      BIT(___GFP_SKIP_KASAN_BIT)
> > >
> > >
> > > Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
> > > kconfig option.
> > >
> > >
> > > Then we could turn the default implementation of
> > > tag_clear_highpage() into a BUILD_BUG.
> > >
> > I'd like to suggest to keep the enum untouched and only use the second
> > part of your suggestion.
>
> Why? We also do that for CONFIG_KASAN_HW_TAGS, CONFIG_LOCKDEP and
> CONFIG_SLAB_OBJ_EXT.

If we remove the enum entry, we’d also need to update mmflags.h because
the trace macros reference it.
Enums are compile-time only, so they don’t affect the generated binary.
My thought was to keep the enum list as it is and just apply the second
part of your suggestion.
That way, the trace definitions stay consistent without extra changes.
Just an idea, happy to go with whatever you prefer.

>
> > Which works by the way for our arch (s390).
> >
> >   include/linux/gfp_types.h | 4 ++++
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> > index 65db9349f905..c12d8a601bb3 100644
> > --- a/include/linux/gfp_types.h
> > +++ b/include/linux/gfp_types.h
> > @@ -85,7 +85,11 @@ enum {
> >   #define ___GFP_HARDWALL        BIT(___GFP_HARDWALL_BIT)
> >   #define ___GFP_THISNODE        BIT(___GFP_THISNODE_BIT)
> >   #define ___GFP_ACCOUNT     BIT(___GFP_ACCOUNT_BIT)
> > +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> >   #define ___GFP_ZEROTAGS        BIT(___GFP_ZEROTAGS_BIT)
> > +#else
> > +#define ___GFP_ZEROTAGS        0
> > +#endif
> >   #ifdef CONFIG_KASAN_HW_TAGS
> >   #define ___GFP_SKIP_ZERO   BIT(___GFP_SKIP_ZERO_BIT)
> >   #define ___GFP_SKIP_KASAN  BIT(___GFP_SKIP_KASAN_BIT)
> >
> > This solution would be sufficient from my side, and I would appreciate a
> > quick application if there are no objections.
>
> As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
> early in that file, it should likely become a CONFIG_ thing.
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-11-11 10:44         ` Jan Polensky
@ 2025-11-11 12:27           ` David Hildenbrand (Red Hat)
  2025-11-11 12:28             ` David Hildenbrand (Red Hat)
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-11 12:27 UTC (permalink / raw)
  To: Jan Polensky, catalin.marinas; +Cc: akpm, linux-arm-kernel, linux-mm, will

On 11.11.25 11:44, Jan Polensky wrote:
> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
>> On 10.11.25 10:48, Jan Polensky wrote:
>>> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>>>> On 09.11.25 01:36, Jan Polensky wrote:
> ---8<--- snip ---8<---
>>>> I wonder if the following would work:
>>>>
>>>>
>>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>>> index 65db9349f9053..56b82e116cb79 100644
>>>> --- a/include/linux/gfp_types.h
>>>> +++ b/include/linux/gfp_types.h
>>>> @@ -47,7 +47,9 @@ enum {
>>>>           ___GFP_HARDWALL_BIT,
>>>>           ___GFP_THISNODE_BIT,
>>>>           ___GFP_ACCOUNT_BIT,
>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>           ___GFP_ZEROTAGS_BIT,
>>>> +#endif
>>>>    #ifdef CONFIG_KASAN_HW_TAGS
>>>>           ___GFP_SKIP_ZERO_BIT,
>>>>           ___GFP_SKIP_KASAN_BIT,
>>>> @@ -85,7 +87,11 @@ enum {
>>>>    #define ___GFP_HARDWALL                BIT(___GFP_HARDWALL_BIT)
>>>>    #define ___GFP_THISNODE                BIT(___GFP_THISNODE_BIT)
>>>>    #define ___GFP_ACCOUNT         BIT(___GFP_ACCOUNT_BIT)
>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>    #define ___GFP_ZEROTAGS                BIT(___GFP_ZEROTAGS_BIT)
>>>> +#else
>>>> +#define ___GFP_ZEROTAGS                0
>>>> +#endif
>>>>    #ifdef CONFIG_KASAN_HW_TAGS
>>>>    #define ___GFP_SKIP_ZERO       BIT(___GFP_SKIP_ZERO_BIT)
>>>>    #define ___GFP_SKIP_KASAN      BIT(___GFP_SKIP_KASAN_BIT)
>>>>
>>>>
>>>> Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
>>>> kconfig option.
>>>>
>>>>
>>>> Then we could turn the default implementation of
>>>> tag_clear_highpage() into a BUILD_BUG.
>>>>
>>> I'd like to suggest to keep the enum untouched and only use the second
>>> part of your suggestion.
>>
>> Why? We also do that for CONFIG_KASAN_HW_TAGS, CONFIG_LOCKDEP and
>> CONFIG_SLAB_OBJ_EXT.
> If we remove the enum entry, we’d also need to update mmflags.h because
> the trace macros reference it.
> Enums are compile-time only, so they don’t affect the generated binary.
> My thought was to keep the enum list as it is and just apply the second
> part of your suggestion.
> That way, the trace definitions stay consistent without extra changes.
> Just an idea, happy to go with whatever you prefer.

I think we'd remove the enum value as well, because then there is no way 
it could accidentally be reused.

And yes, as you correctly state we'll have to update mmflags as well 
like we did for CONFIG_KASAN_HW_TAGS etc.
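
To illustrate the pattern under discussion, here is a minimal standalone
sketch. It is not the kernel code: every DEMO_ name is hypothetical, and
DEMO_HAVE_TAG_CLEAR merely stands in for whatever macro or Kconfig symbol
ends up gating the flag. With the feature macro undefined, the enum entry
disappears (so later bits move down and the slot cannot be reused by
accident) and the flag evaluates to 0, so OR-ing it into a mask is a no-op.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the arch feature switch. Left undefined here
 * to model an architecture without hardware memory tagging.
 */
/* #define DEMO_HAVE_TAG_CLEAR 1 */

#define DEMO_BIT(n)	(1UL << (n))

enum {
	DEMO_ACCOUNT_BIT,
#ifdef DEMO_HAVE_TAG_CLEAR
	DEMO_ZEROTAGS_BIT,	/* only consumes a bit when the feature exists */
#endif
	DEMO_NEXT_BIT,
	DEMO_NR_BITS		/* number of bits actually in use */
};

#define DEMO_ACCOUNT	DEMO_BIT(DEMO_ACCOUNT_BIT)
#ifdef DEMO_HAVE_TAG_CLEAR
#define DEMO_ZEROTAGS	DEMO_BIT(DEMO_ZEROTAGS_BIT)
#else
#define DEMO_ZEROTAGS	0UL	/* flag compiles away entirely */
#endif
#define DEMO_NEXT	DEMO_BIT(DEMO_NEXT_BIT)

/* All flag bits must fit into the mask type. */
static_assert(DEMO_NR_BITS <= 8 * sizeof(unsigned long),
	      "too many flag bits for unsigned long");

/* Callers can request the flag unconditionally; it is 0 when unsupported. */
static unsigned long demo_add_zerotags(unsigned long flags)
{
	return flags | DEMO_ZEROTAGS;
}
```

Any table that maps flag bits to names (the role mmflags.h plays for the
trace macros) would need the same #ifdef around its entry, since the enum
constant no longer exists when the feature is compiled out.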
-- 
Cheers

David




* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
  2025-11-11 12:27           ` David Hildenbrand (Red Hat)
@ 2025-11-11 12:28             ` David Hildenbrand (Red Hat)
  0 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-11 12:28 UTC (permalink / raw)
  To: Jan Polensky, catalin.marinas; +Cc: akpm, linux-arm-kernel, linux-mm, will

On 11.11.25 13:27, David Hildenbrand (Red Hat) wrote:
> On 11.11.25 11:44, Jan Polensky wrote:
>> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
>>> On 10.11.25 10:48, Jan Polensky wrote:
>>>> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>>>>> On 09.11.25 01:36, Jan Polensky wrote:
>> ---8<--- snip ---8<---
>>>>> I wonder if the following would work:
>>>>>
>>>>>
>>>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>>>> index 65db9349f9053..56b82e116cb79 100644
>>>>> --- a/include/linux/gfp_types.h
>>>>> +++ b/include/linux/gfp_types.h
>>>>> @@ -47,7 +47,9 @@ enum {
>>>>>            ___GFP_HARDWALL_BIT,
>>>>>            ___GFP_THISNODE_BIT,
>>>>>            ___GFP_ACCOUNT_BIT,
>>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>>            ___GFP_ZEROTAGS_BIT,
>>>>> +#endif
>>>>>     #ifdef CONFIG_KASAN_HW_TAGS
>>>>>            ___GFP_SKIP_ZERO_BIT,
>>>>>            ___GFP_SKIP_KASAN_BIT,
>>>>> @@ -85,7 +87,11 @@ enum {
>>>>>     #define ___GFP_HARDWALL                BIT(___GFP_HARDWALL_BIT)
>>>>>     #define ___GFP_THISNODE                BIT(___GFP_THISNODE_BIT)
>>>>>     #define ___GFP_ACCOUNT         BIT(___GFP_ACCOUNT_BIT)
>>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>>     #define ___GFP_ZEROTAGS                BIT(___GFP_ZEROTAGS_BIT)
>>>>> +#else
>>>>> +#define ___GFP_ZEROTAGS                0
>>>>> +#endif
>>>>>     #ifdef CONFIG_KASAN_HW_TAGS
>>>>>     #define ___GFP_SKIP_ZERO       BIT(___GFP_SKIP_ZERO_BIT)
>>>>>     #define ___GFP_SKIP_KASAN      BIT(___GFP_SKIP_KASAN_BIT)
>>>>>
>>>>>
>>>>> Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
>>>>> kconfig option.
>>>>>
>>>>>
>>>>> Then we could turn the default implementation of
>>>>> tag_clear_highpage() into a BUILD_BUG.
>>>>>
>>>> I'd like to suggest to keep the enum untouched and only use the second
>>>> part of your suggestion.
>>>
>>> Why? We also do that for CONFIG_KASAN_HW_TAGS, CONFIG_LOCKDEP and
>>> CONFIG_SLAB_OBJ_EXT.
>> If we remove the enum entry, we’d also need to update mmflags.h because
>> the trace macros reference it.
>> Enums are compile-time only, so they don’t affect the generated binary.
>> My thought was to keep the enum list as it is and just apply the second
>> part of your suggestion.
>> That way, the trace definitions stay consistent without extra changes.
>> Just an idea, happy to go with whatever you prefer.
> 
> I think we'd remove the enum value as well, because then there is no way
> it could accidentally be reused.
> 
> And yes, as you correctly state we'll have to update mmflags as well
> like we did for CONFIG_KASAN_HW_TAGS etc.

/me realizing that my mail client decided to use yet another mail alias.
I hope I have it fixed now such that everything is sent from my
kernel.org account ...

-- 
Cheers

David



end of thread, other threads:[~2025-11-11 12:28 UTC | newest]

Thread overview: 21+ messages
2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
2025-10-31 17:15 ` David Hildenbrand
2025-11-03 13:32 ` Mark Brown
2025-11-03 14:30   ` Catalin Marinas
2025-11-03 14:41     ` David Hildenbrand (Red Hat)
2025-11-03 15:59       ` Catalin Marinas
2025-11-03 19:29         ` Beleswar Prasad Padhi
2025-11-04  1:05         ` Andrew Morton
2025-11-04  8:52           ` Catalin Marinas
2025-11-04 11:53     ` [PATCH] mm/huge_memory: Initialise the tags of the huge zero Lance Yang
2025-11-08 19:19 ` [PATCH] mm/huge_memory: initialise the tags of the huge zero folio Jan Polensky
2025-11-09  0:42   ` [PATCH] Clarification: please ignore earlier submission Jan Polensky
2025-11-09  0:36 ` [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures Jan Polensky
2025-11-10  9:09   ` David Hildenbrand (Red Hat)
2025-11-10  9:48     ` Jan Polensky
2025-11-10  9:53       ` David Hildenbrand (Red Hat)
2025-11-10 15:28         ` Catalin Marinas
2025-11-10 15:55           ` Catalin Marinas
2025-11-11 10:44         ` Jan Polensky
2025-11-11 12:27           ` David Hildenbrand (Red Hat)
2025-11-11 12:28             ` David Hildenbrand (Red Hat)
