* [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
@ 2025-10-31 16:57 Catalin Marinas
2025-10-31 17:15 ` David Hildenbrand
` (3 more replies)
0 siblings, 4 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-10-31 16:57 UTC (permalink / raw)
To: linux-mm; +Cc: linux-arm-kernel, David Hildenbrand, Andrew Morton, Will Deacon
On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
user space will need to have its allocation tags initialised. This is
normally done in the arm64 set_pte_at() after checking the memory
attributes. Such page is also marked with the PG_mte_tagged flag to
avoid subsequent clearing. Since this relies on having a struct page,
pte_special() mappings are ignored.
Commit d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero
folio special") maps the huge zero folio special and the arm64
set_pmd_at() will no longer zero the tags. There is no guarantee that
the tags are zero, especially if parts of this huge page have been
previously tagged.
Allocate the huge zero folio with the __GFP_ZEROTAGS flag. In addition,
do not warn in the arm64 __access_remote_tags() when reading tags from
the huge zero page.
Fixes: d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero folio special")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <will@kernel.org>
---
It's fairly easy to detect this by regularly dropping the caches to
force the reallocation of the huge zero folio. I bundled the arm64
change in here as well since they are both related to the commit mapping
the huge zero folio as special.
I don't have any preference how this patch goes in, either the mm tree
or the arm64 fixes one is fine by me.
arch/arm64/kernel/mte.c | 3 ++-
mm/huge_memory.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 43f7a2f39403..32148bf09c1d 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -476,7 +476,8 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
 
 		folio = page_folio(page);
 		if (folio_test_hugetlb(folio))
-			WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio));
+			WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio) &&
+				     !is_huge_zero_folio(folio));
 		else
 			WARN_ON_ONCE(!page_mte_tagged(page) && !is_zero_page(page));
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1b81680b4225..b7498e51282a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -214,7 +214,8 @@ static bool get_huge_zero_folio(void)
 	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
 		return true;
 
-	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
+	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
+				 ~__GFP_MOVABLE,
 			HPAGE_PMD_ORDER);
 	if (!zero_folio) {
 		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
@ 2025-10-31 17:15 ` David Hildenbrand
2025-11-03 13:32 ` Mark Brown
` (2 subsequent siblings)
3 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand @ 2025-10-31 17:15 UTC (permalink / raw)
To: Catalin Marinas, linux-mm; +Cc: linux-arm-kernel, Andrew Morton, Will Deacon
On 31.10.25 17:57, Catalin Marinas wrote:
> On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
> user space will need to have its allocation tags initialised. This is
> normally done in the arm64 set_pte_at() after checking the memory
> attributes. Such page is also marked with the PG_mte_tagged flag to
> avoid subsequent clearing. Since this relies on having a struct page,
> pte_special() mappings are ignored.
>
> Commit d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero
> folio special") maps the huge zero folio special and the arm64
> set_pmd_at() will no longer zero the tags. There is no guarantee that
> the tags are zero, especially if parts of this huge page have been
> previously tagged.
>
> Allocate the huge zero folio with the __GFP_ZEROTAGS flag. In addition,
> do not warn in the arm64 __access_remote_tags() when reading tags from
> the huge zero page.
>
> Fixes: d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero folio special")
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Will Deacon <will@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Thanks, Catalin!
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 21+ messages in thread
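The mask expression changed by the mm/huge_memory.c hunk in the original patch can be modelled in user space. The following is a minimal sketch, not kernel code: the MODEL_* names and bit values are made up for illustration (the real definitions live in include/linux/gfp_types.h), and the only property assumed from the kernel is that GFP_TRANSHUGE is a composite mask that includes __GFP_MOVABLE. It shows that the patch adds the tag-zeroing request while the movable bit is still masked off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
typedef unsigned int gfp_model_t;

#define MODEL_GFP_MOVABLE   (1u << 0)
#define MODEL_GFP_ZERO      (1u << 1)
#define MODEL_GFP_ZEROTAGS  (1u << 2)
#define MODEL_GFP_COMP      (1u << 3)
/* Assumption: like GFP_TRANSHUGE, this composite mask includes MOVABLE. */
#define MODEL_GFP_TRANSHUGE (MODEL_GFP_MOVABLE | MODEL_GFP_COMP)

/* True when every bit of 'flag' is set in 'gfp'. */
bool model_has_flag(gfp_model_t gfp, gfp_model_t flag)
{
	return (gfp & flag) == flag;
}

/*
 * Build the mask used for the huge zero folio allocation, either as it
 * was before the patch (with_zerotags == false) or after it (true).
 */
gfp_model_t model_huge_zero_gfp(bool with_zerotags)
{
	gfp_model_t gfp = MODEL_GFP_TRANSHUGE | MODEL_GFP_ZERO;

	if (with_zerotags)
		gfp |= MODEL_GFP_ZEROTAGS;

	/* The folio must not be movable, so that bit is cleared last. */
	return gfp & ~MODEL_GFP_MOVABLE;
}

/* Small self-check of the two properties the patch relies on. */
void model_check_masks(void)
{
	assert(model_has_flag(model_huge_zero_gfp(true), MODEL_GFP_ZEROTAGS));
	assert(!model_has_flag(model_huge_zero_gfp(true), MODEL_GFP_MOVABLE));
}
```

Because the flag is ORed in unconditionally, every architecture sees the request, which is what the later replies in this thread deal with.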
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
2025-10-31 17:15 ` David Hildenbrand
@ 2025-11-03 13:32 ` Mark Brown
2025-11-03 14:30 ` Catalin Marinas
2025-11-08 19:19 ` [PATCH] mm/huge_memory: initialise the tags of the huge zero folio Jan Polensky
2025-11-09 0:36 ` [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures Jan Polensky
3 siblings, 1 reply; 21+ messages in thread
From: Mark Brown @ 2025-11-03 13:32 UTC (permalink / raw)
To: Catalin Marinas
Cc: linux-mm, linux-arm-kernel, David Hildenbrand, Andrew Morton, Will Deacon, Aishwarya.TCV
[-- Attachment #1: Type: text/plain, Size: 4883 bytes --]
On Fri, Oct 31, 2025 at 04:57:50PM +0000, Catalin Marinas wrote:
> On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
> user space will need to have its allocation tags initialised. This is
> normally done in the arm64 set_pte_at() after checking the memory
> attributes. Such page is also marked with the PG_mte_tagged flag to
> avoid subsequent clearing. Since this relies on having a struct page,
> pte_special() mappings are ignored.
We are seeing breakage in userspace on a range of arm64 platforms which
bisects to this commit in -next. We see traces like:
[ 59.746701] Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
...
[ 59.819007] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 59.826055] pc : mte_zero_clear_page_tags+0x1c/0x40
[ 59.830980] lr : tag_clear_highpage+0x68/0x118
...
[ 59.911874] Call trace:
[ 59.914333] mte_zero_clear_page_tags+0x1c/0x40 (P)
[ 59.919278] get_page_from_freelist+0x1a60/0x1c80
[ 59.924042] __alloc_frozen_pages_noprof+0x178/0xd20
[ 59.929068] alloc_pages_mpol+0xb4/0x1a4
[ 59.933022] alloc_frozen_pages_noprof+0x48/0xc0
[ 59.937683] folio_alloc_noprof+0x14/0x68
[ 59.941718] mm_get_huge_zero_folio+0xf4/0x30c
[ 59.946200] do_huge_pmd_anonymous_page+0x278/0x6a0
[ 59.951119] __handle_mm_fault+0x700/0x1834
[ 59.955332] handle_mm_fault+0x8c/0x2a0
[ 59.959190] do_page_fault+0x108/0x75c
[ 59.962964] do_translation_fault+0x5c/0x6c
[ 59.967181] do_mem_abort+0x40/0x90
Looking at the codes:
> -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
> +	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> +				 ~__GFP_MOVABLE,
> 			HPAGE_PMD_ORDER);
This adds an unconditional __GFP_ZEROTAGS - from a quick scan it looks
like this was previously only enabled by vma_alloc_zeroed_movable_folio()
when the VMA has VM_MTE, I think we need a similar test here.
Full log: https://lava.sirena.org.uk/scheduler/job/2036941#L1423
Sample bisect log (with links to further runtime logs):
# bad: [3575af345aa2424636e69ac101a568bda249abe6] Merge branch 'i2c/i2c-host-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux.git
# good: [6146a0f1dfae5d37442a9ddcba012add260bceb0] Linux 6.18-rc4
git bisect start '3575af345aa2424636e69ac101a568bda249abe6' '6146a0f1dfae5d37442a9ddcba012add260bceb0'
# test job: [3575af345aa2424636e69ac101a568bda249abe6] https://lava.sirena.org.uk/scheduler/job/2036941
# bad: [3575af345aa2424636e69ac101a568bda249abe6] Merge branch 'i2c/i2c-host-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux.git
git bisect bad 3575af345aa2424636e69ac101a568bda249abe6
# test job: [81f07f151b0759666fee3184651aab90ef1c5ed5] https://lava.sirena.org.uk/scheduler/job/2037418
# bad: [81f07f151b0759666fee3184651aab90ef1c5ed5] Merge branch 'usb-linus' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git
git bisect bad 81f07f151b0759666fee3184651aab90ef1c5ed5
# test job: [661821dc41b20c505d5adad10bb71fe5746c57fb] https://lava.sirena.org.uk/scheduler/job/2037588
# bad: [661821dc41b20c505d5adad10bb71fe5746c57fb] Merge branch 'fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl.git
git bisect bad 661821dc41b20c505d5adad10bb71fe5746c57fb
# test job: [972cbfc499182c200bd3d2fb8fe4173df0d4d3e7] https://lava.sirena.org.uk/scheduler/job/2037773
# bad: [972cbfc499182c200bd3d2fb8fe4173df0d4d3e7] mm/huge_memory: initialise the tags of the huge zero folio
git bisect bad 972cbfc499182c200bd3d2fb8fe4173df0d4d3e7
# test job: [6923ff319e1b0d3541cd240b6ed494e1cb6287d1] https://lava.sirena.org.uk/scheduler/job/2037885
# good: [6923ff319e1b0d3541cd240b6ed494e1cb6287d1] fs/proc: fix uaf in proc_readdir_de()
git bisect good 6923ff319e1b0d3541cd240b6ed494e1cb6287d1
# test job: [e356021a7589f6dd1c946b86c933462dca0bc1ec] https://lava.sirena.org.uk/scheduler/job/2037973
# good: [e356021a7589f6dd1c946b86c933462dca0bc1ec] codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
git bisect good e356021a7589f6dd1c946b86c933462dca0bc1ec
# test job: [decb83d6239894a14b07c8f5ebf9feb5c87ec651] https://lava.sirena.org.uk/scheduler/job/2038008
# good: [decb83d6239894a14b07c8f5ebf9feb5c87ec651] mm/damon/sysfs: change next_update_jiffies to a global variable
git bisect good decb83d6239894a14b07c8f5ebf9feb5c87ec651
# test job: [53cf2716958490997b8ce11cc4665256905b23dd] https://lava.sirena.org.uk/scheduler/job/2038076
# good: [53cf2716958490997b8ce11cc4665256905b23dd] nilfs2: avoid having an active sc_timer before freeing sci
git bisect good 53cf2716958490997b8ce11cc4665256905b23dd
# first bad commit: [972cbfc499182c200bd3d2fb8fe4173df0d4d3e7] mm/huge_memory: initialise the tags of the huge zero folio
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
2025-11-03 13:32 ` Mark Brown
@ 2025-11-03 14:30 ` Catalin Marinas
2025-11-03 14:41 ` David Hildenbrand (Red Hat)
2025-11-04 11:53 ` [PATCH] mm/huge_memory: Initialise the tags of the huge zero Lance Yang
0 siblings, 2 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-11-03 14:30 UTC (permalink / raw)
To: Mark Brown
Cc: linux-mm, linux-arm-kernel, David Hildenbrand, Andrew Morton, Will Deacon, Aishwarya.TCV
On Mon, Nov 03, 2025 at 01:32:42PM +0000, Mark Brown wrote:
> On Fri, Oct 31, 2025 at 04:57:50PM +0000, Catalin Marinas wrote:
>
> > On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
> > user space will need to have its allocation tags initialised. This is
> > normally done in the arm64 set_pte_at() after checking the memory
> > attributes. Such page is also marked with the PG_mte_tagged flag to
> > avoid subsequent clearing. Since this relies on having a struct page,
> > pte_special() mappings are ignored.
>
> We are seeing breakage in userspace on a range of arm64 platforms which
> bisects to this commit in -next. We see traces like:
>
> [ 59.746701] Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
>
> ...
>
> [ 59.819007] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 59.826055] pc : mte_zero_clear_page_tags+0x1c/0x40
> [ 59.830980] lr : tag_clear_highpage+0x68/0x118
>
> ...
>
> [ 59.911874] Call trace:
> [ 59.914333] mte_zero_clear_page_tags+0x1c/0x40 (P)
> [ 59.919278] get_page_from_freelist+0x1a60/0x1c80
> [ 59.924042] __alloc_frozen_pages_noprof+0x178/0xd20
> [ 59.929068] alloc_pages_mpol+0xb4/0x1a4
> [ 59.933022] alloc_frozen_pages_noprof+0x48/0xc0
> [ 59.937683] folio_alloc_noprof+0x14/0x68
> [ 59.941718] mm_get_huge_zero_folio+0xf4/0x30c
> [ 59.946200] do_huge_pmd_anonymous_page+0x278/0x6a0
> [ 59.951119] __handle_mm_fault+0x700/0x1834
> [ 59.955332] handle_mm_fault+0x8c/0x2a0
> [ 59.959190] do_page_fault+0x108/0x75c
> [ 59.962964] do_translation_fault+0x5c/0x6c
> [ 59.967181] do_mem_abort+0x40/0x90
Thanks for the report. I missed the fact that the arch
mte_zero_clear_page_tags() code issues MTE instructions
irrespective of whether the hardware supports it. We got away with this
so far since we check the VM_MTE flag and that's only set if the
hardware supports MTE.
> Looking at the codes:
>
> > -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
> > +	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> > +				 ~__GFP_MOVABLE,
> > 			HPAGE_PMD_ORDER);
>
> This adds an unconditional __GFP_ZEROTAGS - from a quick scan it looks
> like this was previously only enabled by vma_alloc_zeroed_movable_folio()
> when the VMA has VM_MTE, I think we need a similar test here.
We can't do this for the huge zero page since this will be shared by
other vmas and not all would have VM_MTE set. I'll fix it in the arch
code:
-----------8<---------------------------------------------
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index d816ff44faff..125dfa6c613b 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
 
 void tag_clear_highpage(struct page *page)
 {
+	/*
+	 * Check if MTE is supported and fall back to clear_highpage().
+	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
+	 * post_alloc_hook() will invoke tag_clear_highpage().
+	 */
+	if (!system_supports_mte()) {
+		clear_highpage(page);
+		return;
+	}
+
 	/* Newly allocated page, shouldn't have been tagged yet */
 	WARN_ON_ONCE(!try_page_mte_tagging(page));
 	mte_zero_clear_page_tags(page_address(page));
------------------8<------------------------------------------
Testing now.
--
Catalin
^ permalink raw reply related [flat|nested] 21+ messages in thread
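The guard added to tag_clear_highpage() in the message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every model_* name is hypothetical, the "MTE instruction" is simulated by a counter, and the PG_mte_tagged bookkeeping from the real function is left out. It shows that, with the early return in place, a system without MTE clears the page data but never reaches the tag-clearing path.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_BYTES 64	/* tiny stand-in for a page */
#define MODEL_TAG_SLOTS  4	/* one tag per 16-byte granule */

struct model_page {
	unsigned char data[MODEL_PAGE_BYTES];
	unsigned char tags[MODEL_TAG_SLOTS];
};

static bool model_has_mte;	/* stand-in for system_supports_mte() */
static int model_mte_insns;	/* counts simulated MTE instructions */

/* Plain zeroing, usable on any hardware. */
static void model_clear_highpage(struct model_page *page)
{
	memset(page->data, 0, sizeof(page->data));
}

/* Zeroes data and tags; on real hardware this would need MTE. */
static void model_mte_zero_clear_page_tags(struct model_page *page)
{
	model_mte_insns++;
	memset(page->data, 0, sizeof(page->data));
	memset(page->tags, 0, sizeof(page->tags));
}

/* Mirrors the shape of the fixed function: check first, then tag-clear. */
static void model_tag_clear_highpage(struct model_page *page)
{
	if (!model_has_mte) {
		model_clear_highpage(page);
		return;
	}
	model_mte_zero_clear_page_tags(page);
}

/*
 * Clear a dirty page on a system with or without MTE and report how
 * many simulated MTE instructions were issued while doing so.
 */
int model_mte_insns_for_clear(bool has_mte)
{
	struct model_page page;

	memset(&page, 0xff, sizeof(page));
	model_has_mte = has_mte;
	model_mte_insns = 0;
	model_tag_clear_highpage(&page);

	/* The data must be zero on both paths. */
	assert(page.data[0] == 0 && page.data[MODEL_PAGE_BYTES - 1] == 0);
	return model_mte_insns;
}
```

In this model the allocation site can keep passing the tag-zeroing request unconditionally, since the decision is made once inside the arch helper rather than at each caller.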
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
2025-11-03 14:30 ` Catalin Marinas
@ 2025-11-03 14:41 ` David Hildenbrand (Red Hat)
2025-11-03 15:59 ` Catalin Marinas
2025-11-04 11:53 ` [PATCH] mm/huge_memory: Initialise the tags of the huge zero Lance Yang
1 sibling, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-03 14:41 UTC (permalink / raw)
To: Catalin Marinas, Mark Brown
Cc: linux-mm, linux-arm-kernel, Andrew Morton, Will Deacon, Aishwarya.TCV
>
> -----------8<---------------------------------------------
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index d816ff44faff..125dfa6c613b 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>
> void tag_clear_highpage(struct page *page)
> {
> +	/*
> +	 * Check if MTE is supported and fall back to clear_highpage().
> +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> +	 * post_alloc_hook() will invoke tag_clear_highpage().
> +	 */
> +	if (!system_supports_mte()) {
> +		clear_highpage(page);
> +		return;
> +	}
> +
LGTM!
--
Cheers
David
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
2025-11-03 14:41 ` David Hildenbrand (Red Hat)
@ 2025-11-03 15:59 ` Catalin Marinas
2025-11-03 19:29 ` Beleswar Prasad Padhi
2025-11-04 1:05 ` Andrew Morton
0 siblings, 2 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-11-03 15:59 UTC (permalink / raw)
To: David Hildenbrand (Red Hat), Andrew Morton
Cc: Mark Brown, linux-mm, linux-arm-kernel, Will Deacon, Aishwarya.TCV
On Mon, Nov 03, 2025 at 03:41:03PM +0100, David Hildenbrand (Red Hat) wrote:
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index d816ff44faff..125dfa6c613b 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
> > void tag_clear_highpage(struct page *page)
> > {
> > +	/*
> > +	 * Check if MTE is supported and fall back to clear_highpage().
> > +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> > +	 * post_alloc_hook() will invoke tag_clear_highpage().
> > +	 */
> > +	if (!system_supports_mte()) {
> > +		clear_highpage(page);
> > +		return;
> > +	}
>
> LGTM!
I tested it with and without MTE and it works fine.
Andrew, would you like a separate patch or are you ok with folding this
into the previous patch?
Thanks.
--
Catalin
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
2025-11-03 15:59 ` Catalin Marinas
@ 2025-11-03 19:29 ` Beleswar Prasad Padhi
2025-11-04 1:05 ` Andrew Morton
1 sibling, 0 replies; 21+ messages in thread
From: Beleswar Prasad Padhi @ 2025-11-03 19:29 UTC (permalink / raw)
To: Catalin Marinas, David Hildenbrand (Red Hat), Andrew Morton
Cc: Mark Brown, linux-mm, linux-arm-kernel, Will Deacon, Aishwarya.TCV
On 11/3/2025 9:29 PM, Catalin Marinas wrote:
> On Mon, Nov 03, 2025 at 03:41:03PM +0100, David Hildenbrand (Red Hat) wrote:
>>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>>> index d816ff44faff..125dfa6c613b 100644
>>> --- a/arch/arm64/mm/fault.c
>>> +++ b/arch/arm64/mm/fault.c
>>> @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>> void tag_clear_highpage(struct page *page)
>>> {
>>> +	/*
>>> +	 * Check if MTE is supported and fall back to clear_highpage().
>>> +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
>>> +	 * post_alloc_hook() will invoke tag_clear_highpage().
>>> +	 */
>>> +	if (!system_supports_mte()) {
>>> +		clear_highpage(page);
>>> +		return;
>>> +	}
>> LGTM!
> I tested it with and without MTE and it works fine.
I tested the above patch on ARM64 based TI J7200 EVM board. Boots fine.
Feel free to use my T/B:
Tested-by: Beleswar Padhi <b-padhi@ti.com>
Thanks,
Beleswar
>
> Andrew, would you like a separate patch or are you ok with folding this
> into the previous patch?
>
> Thanks.
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
2025-11-03 15:59 ` Catalin Marinas
2025-11-03 19:29 ` Beleswar Prasad Padhi
@ 2025-11-04 1:05 ` Andrew Morton
2025-11-04 8:52 ` Catalin Marinas
1 sibling, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2025-11-04 1:05 UTC (permalink / raw)
To: Catalin Marinas
Cc: David Hildenbrand (Red Hat), Mark Brown, linux-mm, linux-arm-kernel, Will Deacon, Aishwarya.TCV
On Mon, 3 Nov 2025 15:59:39 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > --- a/arch/arm64/mm/fault.c
> > > +++ b/arch/arm64/mm/fault.c
> > > @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
> > > void tag_clear_highpage(struct page *page)
> > > {
> > > +	/*
> > > +	 * Check if MTE is supported and fall back to clear_highpage().
> > > +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> > > +	 * post_alloc_hook() will invoke tag_clear_highpage().
> > > +	 */
> > > +	if (!system_supports_mte()) {
> > > +		clear_highpage(page);
> > > +		return;
> > > +	}
> >
> > LGTM!
>
> I tested it with and without MTE and it works fine.
>
> Andrew, would you like a separate patch or are you ok with folding this
> into the previous patch?
I added it as a -fix patch thanks.
And I added a Signed-off-by-you-by-me ;)
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio
2025-11-04 1:05 ` Andrew Morton
@ 2025-11-04 8:52 ` Catalin Marinas
0 siblings, 0 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-11-04 8:52 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand (Red Hat), Mark Brown, linux-mm, linux-arm-kernel, Will Deacon, Aishwarya.TCV
On Mon, Nov 03, 2025 at 05:05:47PM -0800, Andrew Morton wrote:
> On Mon, 3 Nov 2025 15:59:39 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > --- a/arch/arm64/mm/fault.c
> > > > +++ b/arch/arm64/mm/fault.c
> > > > @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
> > > > void tag_clear_highpage(struct page *page)
> > > > {
> > > > +	/*
> > > > +	 * Check if MTE is supported and fall back to clear_highpage().
> > > > +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> > > > +	 * post_alloc_hook() will invoke tag_clear_highpage().
> > > > +	 */
> > > > +	if (!system_supports_mte()) {
> > > > +		clear_highpage(page);
> > > > +		return;
> > > > +	}
> > >
> > > LGTM!
> >
> > I tested it with and without MTE and it works fine.
> >
> > Andrew, would you like a separate patch or are you ok with folding this
> > into the previous patch?
>
> I added it as a -fix patch thanks.
>
> And I added a Signed-off-by-you-by-me ;)
Thanks Andrew.
--
Catalin
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: Initialise the tags of the huge zero
2025-11-03 14:30 ` Catalin Marinas
2025-11-03 14:41 ` David Hildenbrand (Red Hat)
@ 2025-11-04 11:53 ` Lance Yang
1 sibling, 0 replies; 21+ messages in thread
From: Lance Yang @ 2025-11-04 11:53 UTC (permalink / raw)
To: catalin.marinas
Cc: Aishwarya.TCV, akpm, broonie, david, linux-arm-kernel, linux-mm, will, Lance Yang
From: Lance Yang <lance.yang@linux.dev>
On Mon, 3 Nov 2025 14:30:12 +0000, Catalin Marinas wrote:
> On Mon, Nov 03, 2025 at 01:32:42PM +0000, Mark Brown wrote:
> > On Fri, Oct 31, 2025 at 04:57:50PM +0000, Catalin Marinas wrote:
> >
> > > On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
> > > user space will need to have its allocation tags initialised. This is
> > > normally done in the arm64 set_pte_at() after checking the memory
> > > attributes. Such page is also marked with the PG_mte_tagged flag to
> > > avoid subsequent clearing. Since this relies on having a struct page,
> > > pte_special() mappings are ignored.
> >
> > We are seeing breakage in userspace on a range of arm64 platforms which
> > bisects to this commit in -next. We see traces like:
> >
> > [ 59.746701] Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
> >
> > ...
> >
> > [ 59.819007] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 59.826055] pc : mte_zero_clear_page_tags+0x1c/0x40
> > [ 59.830980] lr : tag_clear_highpage+0x68/0x118
> >
> > ...
> >
> > [ 59.911874] Call trace:
> > [ 59.914333] mte_zero_clear_page_tags+0x1c/0x40 (P)
> > [ 59.919278] get_page_from_freelist+0x1a60/0x1c80
> > [ 59.924042] __alloc_frozen_pages_noprof+0x178/0xd20
> > [ 59.929068] alloc_pages_mpol+0xb4/0x1a4
> > [ 59.933022] alloc_frozen_pages_noprof+0x48/0xc0
> > [ 59.937683] folio_alloc_noprof+0x14/0x68
> > [ 59.941718] mm_get_huge_zero_folio+0xf4/0x30c
> > [ 59.946200] do_huge_pmd_anonymous_page+0x278/0x6a0
> > [ 59.951119] __handle_mm_fault+0x700/0x1834
> > [ 59.955332] handle_mm_fault+0x8c/0x2a0
> > [ 59.959190] do_page_fault+0x108/0x75c
> > [ 59.962964] do_translation_fault+0x5c/0x6c
> > [ 59.967181] do_mem_abort+0x40/0x90
>
> Thanks for the report. I missed the fact that the arch
> mte_zero_clear_page_tags() code issues MTE instructions
> irrespective of whether the hardware supports it. We got away with this
> so far since we check the VM_MTE flag and that's only set if the
> hardware supports MTE.
>
> > Looking at the codes:
> >
> > > -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
> > > +	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> > > +				 ~__GFP_MOVABLE,
> > > 			HPAGE_PMD_ORDER);
> >
> > This adds an unconditional __GFP_ZEROTAGS - from a quick scan it looks
> > like this was previously only enabled by vma_alloc_zeroed_movable_folio()
> > when the VMA has VM_MTE, I think we need a similar test here.
>
> We can't do this for the huge zero page since this will be shared by
> other vmas and not all would have VM_MTE set. I'll fix it in the arch
> code:
>
> -----------8<---------------------------------------------
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index d816ff44faff..125dfa6c613b 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -969,6 +969,16 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>
> void tag_clear_highpage(struct page *page)
> {
> +	/*
> +	 * Check if MTE is supported and fall back to clear_highpage().
> +	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> +	 * post_alloc_hook() will invoke tag_clear_highpage().
> +	 */
> +	if (!system_supports_mte()) {
> +		clear_highpage(page);
> +		return;
> +	}
> +
> 	/* Newly allocated page, shouldn't have been tagged yet */
> 	WARN_ON_ONCE(!try_page_mte_tagging(page));
> 	mte_zero_clear_page_tags(page_address(page));
> ------------------8<------------------------------------------
>
> Testing now.
>
Good catch! LGTM, feel free to add:
Reviewed-by: Lance Yang <lance.yang@linux.dev>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] mm/huge_memory: initialise the tags of the huge zero folio
2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
2025-10-31 17:15 ` David Hildenbrand
2025-11-03 13:32 ` Mark Brown
@ 2025-11-08 19:19 ` Jan Polensky
2025-11-09 0:42 ` [PATCH] Clarification: please ignore earlier submission Jan Polensky
2025-11-09 0:36 ` [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures Jan Polensky
3 siblings, 1 reply; 21+ messages in thread
From: Jan Polensky @ 2025-11-08 19:19 UTC (permalink / raw)
To: catalin.marinas; +Cc: akpm, david, linux-arm-kernel, linux-mm, will
From: Catalin Marinas <catalin.marinas@arm.com>
On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
user space will need to have its allocation tags initialised. This is
normally done in the arm64 set_pte_at() after checking the memory
attributes. Such page is also marked with the PG_mte_tagged flag to
avoid subsequent clearing. Since this relies on having a struct page,
pte_special() mappings are ignored.
Commit d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero
folio special") maps the huge zero folio special and the arm64
set_pmd_at() will no longer zero the tags. There is no guarantee that
the tags are zero, especially if parts of this huge page have been
previously tagged.
It's fairly easy to detect this by regularly dropping the caches to
force the reallocation of the huge zero folio.
Allocate the huge zero folio with the __GFP_ZEROTAGS flag. In addition,
do not warn in the arm64 __access_remote_tags() when reading tags from
the huge zero page.
I bundled the arm64 change in here as well since they are both related
to the commit mapping the huge zero folio as special.
Link: https://lkml.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
Fixes: d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero folio special")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jan Polensky <japo@linux.ibm.com>
---
arch/arm64/kernel/mte.c | 3 ++-
mm/huge_memory.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 43f7a2f39403..32148bf09c1d 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -476,7 +476,8 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
 
 		folio = page_folio(page);
 		if (folio_test_hugetlb(folio))
-			WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio));
+			WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio) &&
+				     !is_huge_zero_folio(folio));
 		else
 			WARN_ON_ONCE(!page_mte_tagged(page) && !is_zero_page(page));
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b4ff49d96501..323654fb4f8c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -214,7 +214,8 @@ static bool get_huge_zero_folio(void)
 	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
 		return true;
 
-	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
+	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
+				 ~__GFP_MOVABLE,
 			HPAGE_PMD_ORDER);
 	if (!zero_folio) {
 		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH] Clarification: please ignore earlier submission
2025-11-08 19:19 ` [PATCH] mm/huge_memory: initialise the tags of the huge zero folio Jan Polensky
@ 2025-11-09 0:42 ` Jan Polensky
0 siblings, 0 replies; 21+ messages in thread
From: Jan Polensky @ 2025-11-09 0:42 UTC (permalink / raw)
To: japo; +Cc: akpm, catalin.marinas, david, linux-arm-kernel, linux-mm, will
Hi all,
Apologies for the confusion, the patch titled
"[PATCH] mm/huge_memory: initialise the tags of the huge zero folio"
was accidentally sent.
The correct version is:
https://lore.kernel.org/all/20251109003613.1461433-1-japo@linux.ibm.com/
Thanks,
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
` (2 preceding siblings ...)
2025-11-08 19:19 ` [PATCH] mm/huge_memory: initialise the tags of the huge zero folio Jan Polensky
@ 2025-11-09 0:36 ` Jan Polensky
2025-11-10 9:09 ` David Hildenbrand (Red Hat)
3 siblings, 1 reply; 21+ messages in thread
From: Jan Polensky @ 2025-11-09 0:36 UTC (permalink / raw)
To: catalin.marinas; +Cc: akpm, david, linux-arm-kernel, linux-mm, will
The previous change added __GFP_ZEROTAGS when allocating the huge zero
folio to ensure tag initialization for arm64 with MTE enabled. However,
on s390 this flag is unnecessary and triggers a regression
(observed as a crash during repeated 'dnf makecache').
Restrict the use of __GFP_ZEROTAGS to architectures that support
hardware memory tagging (currently arm64 with MTE or KASAN HW tags).
This avoids unintended side effects on other platforms.
Fixes: 1579227fe0f0 ("mm/huge_memory: initialise the tags of the huge zero folio")
Link: https://lore.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
Signed-off-by: Jan Polensky <japo@linux.ibm.com>
---
mm/huge_memory.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index aae283b00857..0c1794656d7a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -209,14 +209,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 
 static bool get_huge_zero_folio(void)
 {
+	gfp_t gfp = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE;
 	struct folio *zero_folio;
 retry:
 	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
 		return true;
-
-	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
-				 ~__GFP_MOVABLE,
-			HPAGE_PMD_ORDER);
+#if IS_ENABLED(CONFIG_KASAN_HW_TAGS) || IS_ENABLED(CONFIG_ARM64_MTE)
+	gfp |= __GFP_ZEROTAGS;
+#endif
+	zero_folio = folio_alloc(gfp, HPAGE_PMD_ORDER);
 	if (!zero_folio) {
 		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
 		return false;
--
2.48.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-11-09 0:36 ` [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures Jan Polensky
@ 2025-11-10 9:09 ` David Hildenbrand (Red Hat)
2025-11-10 9:48 ` Jan Polensky
0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-10 9:09 UTC (permalink / raw)
To: Jan Polensky, catalin.marinas; +Cc: akpm, linux-arm-kernel, linux-mm, will

On 09.11.25 01:36, Jan Polensky wrote:
> The previous change added __GFP_ZEROTAGS when allocating the huge zero
> folio to ensure tag initialization for arm64 with MTE enabled. However,
> on s390 this flag is unnecessary and triggers a regression
> (observed as a crash during repeated 'dnf makecache').
>
> Restrict the use of __GFP_ZEROTAGS to architectures that support
> hardware memory tagging (currently arm64 with MTE or KASAN HW tags).
> This avoids unintended side effects on other platforms.
>
> Fixes: 1579227fe0f0 ("mm/huge_memory: initialise the tags of the huge zero folio")
> Link: https://lore.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
> Signed-off-by: Jan Polensky <japo@linux.ibm.com>
> ---
>  mm/huge_memory.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index aae283b00857..0c1794656d7a 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -209,14 +209,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>
>  static bool get_huge_zero_folio(void)
>  {
> +	gfp_t gfp = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE;
>  	struct folio *zero_folio;
>  retry:
>  	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
>  		return true;
> -
> -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> -				 ~__GFP_MOVABLE,
> -				 HPAGE_PMD_ORDER);
> +#if IS_ENABLED(CONFIG_KASAN_HW_TAGS) || IS_ENABLED(CONFIG_ARM64_MTE)
> +	gfp |= __GFP_ZEROTAGS;
> +#endif

That looks like the wrong approach. If an architecture does not support
__GFP_ZEROTAGS it should not trigger anything. __GFP_ZEROTAGS should be ignored.

I think the problem is that post_alloc_hook() does

	if (zero_tags) {
		/* Initialize both memory and memory tags. */
		for (i = 0; i != 1 << order; ++i)
			tag_clear_highpage(page + i);

		/* Take note that memory was initialized by the loop above. */
		init = false;
	}

And tag_clear_highpage() is a NOP on other architectures.

Gah.

I wonder if the following would work:


diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 65db9349f9053..56b82e116cb79 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -47,7 +47,9 @@ enum {
 	___GFP_HARDWALL_BIT,
 	___GFP_THISNODE_BIT,
 	___GFP_ACCOUNT_BIT,
+#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
 	___GFP_ZEROTAGS_BIT,
+#endif
 #ifdef CONFIG_KASAN_HW_TAGS
 	___GFP_SKIP_ZERO_BIT,
 	___GFP_SKIP_KASAN_BIT,
@@ -85,7 +87,11 @@ enum {
 #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
 #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
 #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
+#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
 #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
+#else
+#define ___GFP_ZEROTAGS 0
+#endif
 #ifdef CONFIG_KASAN_HW_TAGS
 #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
 #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)


Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
kconfig option.


Then we could turn the default implementation of
tag_clear_highpage() into a BUILD_BUG.

^ permalink raw reply related [flat|nested] 21+ messages in thread
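The interaction described in the message above can be modelled in user space. What follows is a minimal sketch under stated assumptions, not kernel code: `struct model_page`, `MODEL_PAGE_SIZE` and every `model_*` function are hypothetical stand-ins, and only the `zero_tags`/`init` control flow is taken from the quoted post_alloc_hook() snippet. It shows that when the tag-clearing helper is a no-op, a page requested with both zeroing and tag zeroing is left with stale data, because `init` is cleared before the ordinary zeroing path runs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64u	/* hypothetical, kept small for illustration */

/* Hypothetical stand-in for struct page: only the page contents. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Models the generic tag_clear_highpage() fallback: it does nothing. */
void model_tag_clear_nop(struct model_page *page)
{
	(void)page;
}

/* Models an arch helper that clears the data (and, on real HW, the tags). */
void model_tag_clear_arch(struct model_page *page)
{
	memset(page->data, 0, sizeof(page->data));
}

/* Models the zero_tags/init interplay quoted from post_alloc_hook(). */
void model_post_alloc(struct model_page *page, unsigned int order,
		      bool init, bool zero_tags,
		      void (*tag_clear)(struct model_page *))
{
	unsigned int i;

	if (zero_tags) {
		for (i = 0; i != 1u << order; ++i)
			tag_clear(page + i);
		/* Memory is assumed to be initialised by the loop above. */
		init = false;
	}

	/* Ordinary zeroing path, skipped once init has been cleared. */
	if (init) {
		for (i = 0; i != 1u << order; ++i)
			memset(page[i].data, 0, sizeof(page[i].data));
	}
}

/* Returns true if every byte of the 2^order pages is zero. */
bool model_pages_zeroed(const struct model_page *page, unsigned int order)
{
	const unsigned char *p = (const unsigned char *)page;
	size_t n = ((size_t)1 << order) * MODEL_PAGE_SIZE;
	size_t i;

	for (i = 0; i < n; i++) {
		if (p[i])
			return false;
	}
	return true;
}
```

Calling `model_post_alloc()` with `init` and `zero_tags` both true and `model_tag_clear_nop` as the helper leaves a pre-filled buffer untouched, while the same call with `zero_tags` false, or with `model_tag_clear_arch`, zeroes it.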
* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-11-10 9:09 ` David Hildenbrand (Red Hat)
@ 2025-11-10 9:48 ` Jan Polensky
2025-11-10 9:53 ` David Hildenbrand (Red Hat)
0 siblings, 1 reply; 21+ messages in thread
From: Jan Polensky @ 2025-11-10 9:48 UTC (permalink / raw)
To: David Hildenbrand (Red Hat), catalin.marinas
Cc: akpm, linux-arm-kernel, linux-mm, will

On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
> On 09.11.25 01:36, Jan Polensky wrote:
> > The previous change added __GFP_ZEROTAGS when allocating the huge zero
> > folio to ensure tag initialization for arm64 with MTE enabled. However,
> > on s390 this flag is unnecessary and triggers a regression
> > (observed as a crash during repeated 'dnf makecache').
> >
> > Restrict the use of __GFP_ZEROTAGS to architectures that support
> > hardware memory tagging (currently arm64 with MTE or KASAN HW tags).
> > This avoids unintended side effects on other platforms.
> >
> > Fixes: 1579227fe0f0 ("mm/huge_memory: initialise the tags of the huge zero folio")
> > Link: https://lore.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
> > Signed-off-by: Jan Polensky <japo@linux.ibm.com>
> > ---
> >  mm/huge_memory.c | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index aae283b00857..0c1794656d7a 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -209,14 +209,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> >
> >  static bool get_huge_zero_folio(void)
> >  {
> > +	gfp_t gfp = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE;
> >  	struct folio *zero_folio;
> >  retry:
> >  	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
> >  		return true;
> > -
> > -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
> > -				 ~__GFP_MOVABLE,
> > -				 HPAGE_PMD_ORDER);
> > +#if IS_ENABLED(CONFIG_KASAN_HW_TAGS) || IS_ENABLED(CONFIG_ARM64_MTE)
> > +	gfp |= __GFP_ZEROTAGS;
> > +#endif
>
> That looks like the wrong approach. If an architecture does not support
> __GFP_ZEROTAGS it should not trigger anything. __GFP_ZEROTAGS should be ignored.
>
> I think the problem is that post_alloc_hook() does
>
> 	if (zero_tags) {
> 		/* Initialize both memory and memory tags. */
> 		for (i = 0; i != 1 << order; ++i)
> 			tag_clear_highpage(page + i);
>
> 		/* Take note that memory was initialized by the loop above. */
> 		init = false;
> 	}
>
> And tag_clear_highpage() is a NOP on other architectures.
>
> Gah.
>
> I wonder if the following would work:
>
>
> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> index 65db9349f9053..56b82e116cb79 100644
> --- a/include/linux/gfp_types.h
> +++ b/include/linux/gfp_types.h
> @@ -47,7 +47,9 @@ enum {
>  	___GFP_HARDWALL_BIT,
>  	___GFP_THISNODE_BIT,
>  	___GFP_ACCOUNT_BIT,
> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>  	___GFP_ZEROTAGS_BIT,
> +#endif
>  #ifdef CONFIG_KASAN_HW_TAGS
>  	___GFP_SKIP_ZERO_BIT,
>  	___GFP_SKIP_KASAN_BIT,
> @@ -85,7 +87,11 @@ enum {
>  #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
>  #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
>  #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>  #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
> +#else
> +#define ___GFP_ZEROTAGS 0
> +#endif
>  #ifdef CONFIG_KASAN_HW_TAGS
>  #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
>  #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
>
>
> Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
> kconfig option.
>
>
> Then we could turn the default implementation of
> tag_clear_highpage() into a BUILD_BUG.
>
I'd like to suggest to keep the enum untouched and only use the second
part of your suggestion.
Which works by the way for our arch (s390).

 include/linux/gfp_types.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 65db9349f905..c12d8a601bb3 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -85,7 +85,11 @@ enum {
 #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
 #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
 #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
+#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
 #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
+#else
+#define ___GFP_ZEROTAGS 0
+#endif
 #ifdef CONFIG_KASAN_HW_TAGS
 #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
 #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)

This solution would be sufficient from my side, and I would appreciate a
quick application if there are no objections.

Thank you David.

^ permalink raw reply related [flat|nested] 21+ messages in thread
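The effect of defining the flag as 0, as in the diff above, can be illustrated with a small user-space sketch. This is an assumption-laden model rather than kernel code: `model_gfp_t`, the `MODEL_GFP_*` constants and their bit positions are hypothetical, and `model_zero_tags()` only mirrors the shape of the `init && (gfp_flags & __GFP_ZEROTAGS)` test. When the flag is 0, OR-ing it into a mask changes nothing and the mask test can never be true, so the request is silently ignored.

```c
#include <stdbool.h>

typedef unsigned int model_gfp_t;	/* hypothetical stand-in for gfp_t */

#define MODEL_GFP_ZERO		(1u << 0)	/* hypothetical bit position */
#define MODEL_GFP_ZEROTAGS_ON	(1u << 1)	/* arch provides tag clearing */
#define MODEL_GFP_ZEROTAGS_OFF	0u		/* arch without tag clearing */

/* Mirrors the shape of the zero_tags test: init && (gfp & flag). */
bool model_zero_tags(model_gfp_t gfp, model_gfp_t zerotags_flag, bool init)
{
	return init && (gfp & zerotags_flag) != 0;
}
```

With `MODEL_GFP_ZEROTAGS_OFF`, a caller that unconditionally passes the flag ends up with a mask identical to one that never asked for tag zeroing, which is the "should be ignored" behaviour asked for earlier in the thread.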
* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-11-10 9:48 ` Jan Polensky
@ 2025-11-10 9:53 ` David Hildenbrand (Red Hat)
2025-11-10 15:28 ` Catalin Marinas
2025-11-11 10:44 ` Jan Polensky
0 siblings, 2 replies; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-10 9:53 UTC (permalink / raw)
To: Jan Polensky, catalin.marinas; +Cc: akpm, linux-arm-kernel, linux-mm, will

On 10.11.25 10:48, Jan Polensky wrote:
> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>> On 09.11.25 01:36, Jan Polensky wrote:
>>> The previous change added __GFP_ZEROTAGS when allocating the huge zero
>>> folio to ensure tag initialization for arm64 with MTE enabled. However,
>>> on s390 this flag is unnecessary and triggers a regression
>>> (observed as a crash during repeated 'dnf makecache').
>>>
>>> Restrict the use of __GFP_ZEROTAGS to architectures that support
>>> hardware memory tagging (currently arm64 with MTE or KASAN HW tags).
>>> This avoids unintended side effects on other platforms.
>>>
>>> Fixes: 1579227fe0f0 ("mm/huge_memory: initialise the tags of the huge zero folio")
>>> Link: https://lore.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
>>> Signed-off-by: Jan Polensky <japo@linux.ibm.com>
>>> ---
>>>  mm/huge_memory.c | 9 +++++----
>>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index aae283b00857..0c1794656d7a 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -209,14 +209,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>>
>>>  static bool get_huge_zero_folio(void)
>>>  {
>>> +	gfp_t gfp = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE;
>>>  	struct folio *zero_folio;
>>>  retry:
>>>  	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
>>>  		return true;
>>> -
>>> -	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
>>> -				 ~__GFP_MOVABLE,
>>> -				 HPAGE_PMD_ORDER);
>>> +#if IS_ENABLED(CONFIG_KASAN_HW_TAGS) || IS_ENABLED(CONFIG_ARM64_MTE)
>>> +	gfp |= __GFP_ZEROTAGS;
>>> +#endif
>>
>> That looks like the wrong approach. If an architecture does not support
>> __GFP_ZEROTAGS it should not trigger anything. __GFP_ZEROTAGS should be ignored.
>>
>> I think the problem is that post_alloc_hook() does
>>
>> 	if (zero_tags) {
>> 		/* Initialize both memory and memory tags. */
>> 		for (i = 0; i != 1 << order; ++i)
>> 			tag_clear_highpage(page + i);
>>
>> 		/* Take note that memory was initialized by the loop above. */
>> 		init = false;
>> 	}
>>
>> And tag_clear_highpage() is a NOP on other architectures.
>>
>> Gah.
>>
>> I wonder if the following would work:
>>
>>
>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>> index 65db9349f9053..56b82e116cb79 100644
>> --- a/include/linux/gfp_types.h
>> +++ b/include/linux/gfp_types.h
>> @@ -47,7 +47,9 @@ enum {
>>  	___GFP_HARDWALL_BIT,
>>  	___GFP_THISNODE_BIT,
>>  	___GFP_ACCOUNT_BIT,
>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>  	___GFP_ZEROTAGS_BIT,
>> +#endif
>>  #ifdef CONFIG_KASAN_HW_TAGS
>>  	___GFP_SKIP_ZERO_BIT,
>>  	___GFP_SKIP_KASAN_BIT,
>> @@ -85,7 +87,11 @@ enum {
>>  #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
>>  #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
>>  #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>  #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
>> +#else
>> +#define ___GFP_ZEROTAGS 0
>> +#endif
>>  #ifdef CONFIG_KASAN_HW_TAGS
>>  #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
>>  #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
>>
>>
>> Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
>> kconfig option.
>>
>>
>> Then we could turn the default implementation of
>> tag_clear_highpage() into a BUILD_BUG.
>>
> I'd like to suggest to keep the enum untouched and only use the second
> part of your suggestion.

Why? We also do that for CONFIG_KASAN_HW_TAGS, CONFIG_LOCKDEP and
CONFIG_SLAB_OBJ_EXT.

> Which works by the way for our arch (s390).
>
>  include/linux/gfp_types.h | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> index 65db9349f905..c12d8a601bb3 100644
> --- a/include/linux/gfp_types.h
> +++ b/include/linux/gfp_types.h
> @@ -85,7 +85,11 @@ enum {
>  #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
>  #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
>  #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>  #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
> +#else
> +#define ___GFP_ZEROTAGS 0
> +#endif
>  #ifdef CONFIG_KASAN_HW_TAGS
>  #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
>  #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
>
> This solution would be sufficient from my side, and I would appreciate a
> quick application if there are no objections.

As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
early in that file, it should likely become a CONFIG_ thing.

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-11-10 9:53 ` David Hildenbrand (Red Hat)
@ 2025-11-10 15:28 ` Catalin Marinas
2025-11-10 15:55 ` Catalin Marinas
2025-11-11 10:44 ` Jan Polensky
1 sibling, 1 reply; 21+ messages in thread
From: Catalin Marinas @ 2025-11-10 15:28 UTC (permalink / raw)
To: David Hildenbrand (Red Hat)
Cc: Jan Polensky, akpm, linux-arm-kernel, linux-mm, will

On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
> On 10.11.25 10:48, Jan Polensky wrote:
> > On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
> > > On 09.11.25 01:36, Jan Polensky wrote:
> > > > The previous change added __GFP_ZEROTAGS when allocating the huge zero
> > > > folio to ensure tag initialization for arm64 with MTE enabled. However,
> > > > on s390 this flag is unnecessary and triggers a regression
> > > > (observed as a crash during repeated 'dnf makecache').
[...]
> > > I think the problem is that post_alloc_hook() does
> > >
> > > 	if (zero_tags) {
> > > 		/* Initialize both memory and memory tags. */
> > > 		for (i = 0; i != 1 << order; ++i)
> > > 			tag_clear_highpage(page + i);
> > >
> > > 		/* Take note that memory was initialized by the loop above. */
> > > 		init = false;
> > > 	}
> > >
> > > And tag_clear_highpage() is a NOP on other architectures.

Hmm, another thing I missed. Sorry about this.

> > Which works by the way for our arch (s390).
> >
> >  include/linux/gfp_types.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> > index 65db9349f905..c12d8a601bb3 100644
> > --- a/include/linux/gfp_types.h
> > +++ b/include/linux/gfp_types.h
> > @@ -85,7 +85,11 @@ enum {
> >  #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
> >  #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
> >  #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
> > +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> >  #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
> > +#else
> > +#define ___GFP_ZEROTAGS 0
> > +#endif
> >  #ifdef CONFIG_KASAN_HW_TAGS
> >  #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
> >  #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
> >
> > This solution would be sufficient from my side, and I would appreciate a
> > quick application if there are no objections.
>
> As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
> early in that file, it should likely become a CONFIG_ thing.

I'm fine with either option above but I'll throw one more in the mix:

--------------------8<--------------------------------
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 2312e6ee595f..dcff91533590 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -33,6 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
 						unsigned long vaddr);
 #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio

+bool arch_has_tag_clear_highpage(void);
 void tag_clear_highpage(struct page *to);
 #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 125dfa6c613b..318d091db843 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -967,18 +967,13 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
 	return vma_alloc_folio(flags, 0, vma, vaddr);
 }

+bool arch_has_tag_clear_highpage(void)
+{
+	return system_supports_mte();
+}
+
 void tag_clear_highpage(struct page *page)
 {
-	/*
-	 * Check if MTE is supported and fall back to clear_highpage().
-	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
-	 * post_alloc_hook() will invoke tag_clear_highpage().
-	 */
-	if (!system_supports_mte()) {
-		clear_highpage(page);
-		return;
-	}
-
 	/* Newly allocated page, shouldn't have been tagged yet */
 	WARN_ON_ONCE(!try_page_mte_tagging(page));
 	mte_zero_clear_page_tags(page_address(page));
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 105cc4c00cc3..7aa56179ccef 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -251,6 +251,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page)

 #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE

+static inline bool arch_has_tag_clear_highpage(void)
+{
+	return false;
+}
+
 static inline void tag_clear_highpage(struct page *page)
 {
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e4efda1158b2..5ab15431bc06 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1798,7 +1798,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 {
 	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
 			!should_skip_init(gfp_flags);
-	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
+	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS) &&
+		arch_has_tag_clear_highpage();
 	int i;

 	set_page_private(page, 0);
--------------------8<--------------------------------

Reasoning: with MTE on arm64, you can't have kasan-tagged pages in the
kernel which are also exposed to user because the tags are shared (same
physical location). The 'zero_tags' initialisation in post_alloc_hook()
makes sense for this behaviour. With virtual tagging (briefly announced
in [1], full specs not public yet), both the user and the kernel can
have their own tags - more like KASAN_SW_TAGS but without the compiler
instrumentation. The kernel won't be able to zero the tags for the user
since they are in virtual space. It can, however, continue to use Kasan
tags even if the pages are mapped in user space. In this case, I'd
rather use the kernel_init_pages() call further down in
post_alloc_hook() than replicating it in tag_clear_highpage(). When we
get to upstreaming virtual tagging (informally vMTE, sometime next
year), I'd like to have a kernel image that supports both, so the
decision on whether to call tag_clear_highpage() will need to be
dynamic.

[1] https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/future-architecture-technologies-poe2-and-vmte

-- 
Catalin

^ permalink raw reply related [flat|nested] 21+ messages in thread
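The runtime gating proposed in the diff above can be sketched in user space as follows. This is a hedged illustration, not the kernel implementation: `MODEL_GFP_ZEROTAGS`, `model_zero_tags_runtime()` and the two predicate functions are hypothetical names, and the predicate is passed as a function pointer only so that both outcomes can be exercised in one program (in the diff it is a per-architecture function, `arch_has_tag_clear_highpage()`). The point is that the decision is made when the allocation happens rather than at build time, so a single image could behave differently depending on what the hardware reports.

```c
#include <stdbool.h>

#define MODEL_GFP_ZEROTAGS (1u << 1)	/* hypothetical bit position */

/* Models an arch whose predicate reports tag support at run time. */
bool model_arch_with_tags(void)
{
	return true;
}

/* Models the generic fallback predicate, which always reports false. */
bool model_arch_without_tags(void)
{
	return false;
}

/*
 * Mirrors the shape of the amended zero_tags test:
 * init && (gfp & flag) && arch predicate.
 */
bool model_zero_tags_runtime(unsigned int gfp, bool init,
			     bool (*arch_has_tag_clear)(void))
{
	return init && (gfp & MODEL_GFP_ZEROTAGS) != 0 && arch_has_tag_clear();
}
```

With `model_arch_without_tags`, the result is false even when the caller passes the flag, so the ordinary zeroing path would still run.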
* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-11-10 15:28 ` Catalin Marinas
@ 2025-11-10 15:55 ` Catalin Marinas
0 siblings, 0 replies; 21+ messages in thread
From: Catalin Marinas @ 2025-11-10 15:55 UTC (permalink / raw)
To: David Hildenbrand (Red Hat)
Cc: Jan Polensky, akpm, linux-arm-kernel, linux-mm, will

On Mon, Nov 10, 2025 at 03:28:16PM +0000, Catalin Marinas wrote:
> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
> > On 10.11.25 10:48, Jan Polensky wrote:
> > > On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
> > > > On 09.11.25 01:36, Jan Polensky wrote:
> > > > > The previous change added __GFP_ZEROTAGS when allocating the huge zero
> > > > > folio to ensure tag initialization for arm64 with MTE enabled. However,
> > > > > on s390 this flag is unnecessary and triggers a regression
> > > > > (observed as a crash during repeated 'dnf makecache').
> [...]
> > > > I think the problem is that post_alloc_hook() does
> > > >
> > > > 	if (zero_tags) {
> > > > 		/* Initialize both memory and memory tags. */
> > > > 		for (i = 0; i != 1 << order; ++i)
> > > > 			tag_clear_highpage(page + i);
> > > >
> > > > 		/* Take note that memory was initialized by the loop above. */
> > > > 		init = false;
> > > > 	}
> > > >
> > > > And tag_clear_highpage() is a NOP on other architectures.
[...]
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 2312e6ee595f..dcff91533590 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -33,6 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>  						unsigned long vaddr);
>  #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio
>
> +bool arch_has_tag_clear_highpage(void);
>  void tag_clear_highpage(struct page *to);
>  #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 125dfa6c613b..318d091db843 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -967,18 +967,13 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>  	return vma_alloc_folio(flags, 0, vma, vaddr);
>  }
>
> +bool arch_has_tag_clear_highpage(void)
> +{
> +	return system_supports_mte();
> +}
> +
>  void tag_clear_highpage(struct page *page)
>  {
> -	/*
> -	 * Check if MTE is supported and fall back to clear_highpage().
> -	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> -	 * post_alloc_hook() will invoke tag_clear_highpage().
> -	 */
> -	if (!system_supports_mte()) {
> -		clear_highpage(page);
> -		return;
> -	}
> -
>  	/* Newly allocated page, shouldn't have been tagged yet */
>  	WARN_ON_ONCE(!try_page_mte_tagging(page));
>  	mte_zero_clear_page_tags(page_address(page));
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 105cc4c00cc3..7aa56179ccef 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -251,6 +251,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page)
>
>  #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>
> +static inline bool arch_has_tag_clear_highpage(void)
> +{
> +	return false;
> +}
> +
>  static inline void tag_clear_highpage(struct page *page)
>  {
>  }
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e4efda1158b2..5ab15431bc06 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1798,7 +1798,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>  {
>  	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
>  			!should_skip_init(gfp_flags);
> -	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
> +	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS) &&
> +		arch_has_tag_clear_highpage();
>  	int i;
>
>  	set_page_private(page, 0);
> --------------------8<--------------------------------
>
> Reasoning: with MTE on arm64, you can't have kasan-tagged pages in the
> kernel which are also exposed to user because the tags are shared (same
> physical location). The 'zero_tags' initialisation in post_alloc_hook()
> makes sense for this behaviour. With virtual tagging (briefly announced
> in [1], full specs not public yet), both the user and the kernel can
> have their own tags - more like KASAN_SW_TAGS but without the compiler
> instrumentation. The kernel won't be able to zero the tags for the user
> since they are in virtual space. It can, however, continue to use Kasan
> tags even if the pages are mapped in user space. In this case, I'd
> rather use the kernel_init_pages() call further down in
> post_alloc_hook() than replicating it in tag_clear_highpage(). When we
> get to upstreaming virtual tagging (informally vMTE, sometime next
> year), I'd like to have a kernel image that supports both, so the
> decision on whether to call tag_clear_highpage() will need to be
> dynamic.

Actually, there's not much to kernel_init_pages() other than disabling
kasan temporarily since the unpoisoning already took place a few lines
up. The arm64 tag_clear_highpage() calling clear_highpage() directly is
fine before unpoisoning. So we can cope with this even in the vMTE case.

A simple patch hiding the enum is fine by me.

-- 
Catalin

^ permalink raw reply [flat|nested] 21+ messages in thread
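What "hiding the enum" does to bit numbering can be shown with a small sketch. It is illustrative only: the two enums and their entries are hypothetical, shortened stand-ins for the bit list in gfp_types.h, written side by side so that both configurations compile in one file (in the proposed patch there is a single enum with the entry under an #ifdef). Enumerators are numbered in order of appearance, so compiling one out moves every later bit down by one and removes the name altogether.

```c
#define MODEL_BIT(nr) (1u << (nr))	/* hypothetical analogue of BIT() */

/* Shortened bit list with the ZEROTAGS entry always present. */
enum model_bits_kept {
	KEPT_ACCOUNT_BIT,
	KEPT_ZEROTAGS_BIT,	/* name still exists, even if the flag is 0 */
	KEPT_SKIP_ZERO_BIT,
	KEPT_LAST_BIT
};

/* The same list as it looks when the ZEROTAGS entry is compiled out. */
enum model_bits_hidden {
	HIDDEN_ACCOUNT_BIT,
	HIDDEN_SKIP_ZERO_BIT,
	HIDDEN_LAST_BIT
};

/* Mask covering all bits in use when the entry is kept. */
unsigned int model_mask_kept(void)
{
	return MODEL_BIT(KEPT_LAST_BIT) - 1;
}

/* Mask covering all bits in use when the entry is hidden. */
unsigned int model_mask_hidden(void)
{
	return MODEL_BIT(HIDDEN_LAST_BIT) - 1;
}
```

In the hidden variant no enumerator for the tag-zeroing bit exists, so code that tried to build a mask from it would fail to compile instead of quietly setting an otherwise unused bit.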
* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-11-10 9:53 ` David Hildenbrand (Red Hat)
2025-11-10 15:28 ` Catalin Marinas
@ 2025-11-11 10:44 ` Jan Polensky
2025-11-11 12:27 ` David Hildenbrand (Red Hat)
1 sibling, 1 reply; 21+ messages in thread
From: Jan Polensky @ 2025-11-11 10:44 UTC (permalink / raw)
To: David Hildenbrand (Red Hat), catalin.marinas
Cc: akpm, linux-arm-kernel, linux-mm, will

On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
> On 10.11.25 10:48, Jan Polensky wrote:
> > On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
> > > On 09.11.25 01:36, Jan Polensky wrote:
---8<--- snip ---8<---
> > > I wonder if the following would work:
> > >
> > >
> > > diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> > > index 65db9349f9053..56b82e116cb79 100644
> > > --- a/include/linux/gfp_types.h
> > > +++ b/include/linux/gfp_types.h
> > > @@ -47,7 +47,9 @@ enum {
> > >  	___GFP_HARDWALL_BIT,
> > >  	___GFP_THISNODE_BIT,
> > >  	___GFP_ACCOUNT_BIT,
> > > +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> > >  	___GFP_ZEROTAGS_BIT,
> > > +#endif
> > >  #ifdef CONFIG_KASAN_HW_TAGS
> > >  	___GFP_SKIP_ZERO_BIT,
> > >  	___GFP_SKIP_KASAN_BIT,
> > > @@ -85,7 +87,11 @@ enum {
> > >  #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
> > >  #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
> > >  #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
> > > +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> > >  #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
> > > +#else
> > > +#define ___GFP_ZEROTAGS 0
> > > +#endif
> > >  #ifdef CONFIG_KASAN_HW_TAGS
> > >  #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
> > >  #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
> > >
> > >
> > > Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
> > > kconfig option.
> > >
> > >
> > > Then we could turn the default implementation of
> > > tag_clear_highpage() into a BUILD_BUG.
> > >
> > I'd like to suggest to keep the enum untouched and only use the second
> > part of your suggestion.
>
> Why? We also do that for CONFIG_KASAN_HW_TAGS, CONFIG_LOCKDEP and
> CONFIG_SLAB_OBJ_EXT.
If we remove the enum entry, we’d also need to update mmflags.h because
the trace macros reference it.
Enums are compile-time only, so they don’t affect the generated binary.
My thought was to keep the enum list as it is and just apply the second
part of your suggestion.
That way, the trace definitions stay consistent without extra changes.
Just an idea, happy to go with whatever you prefer.
>
> > Which works by the way for our arch (s390).
> >
> >  include/linux/gfp_types.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
> > index 65db9349f905..c12d8a601bb3 100644
> > --- a/include/linux/gfp_types.h
> > +++ b/include/linux/gfp_types.h
> > @@ -85,7 +85,11 @@ enum {
> >  #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
> >  #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
> >  #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
> > +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> >  #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
> > +#else
> > +#define ___GFP_ZEROTAGS 0
> > +#endif
> >  #ifdef CONFIG_KASAN_HW_TAGS
> >  #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
> >  #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
> >
> > This solution would be sufficient from my side, and I would appreciate a
> > quick application if there are no objections.
>
> As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
> early in that file, it should likely become a CONFIG_ thing.
>

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-11-11 10:44 ` Jan Polensky
@ 2025-11-11 12:27 ` David Hildenbrand (Red Hat)
2025-11-11 12:28 ` David Hildenbrand (Red Hat)
0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-11 12:27 UTC (permalink / raw)
To: Jan Polensky, catalin.marinas; +Cc: akpm, linux-arm-kernel, linux-mm, will

On 11.11.25 11:44, Jan Polensky wrote:
> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
>> On 10.11.25 10:48, Jan Polensky wrote:
>>> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>>>> On 09.11.25 01:36, Jan Polensky wrote:
> ---8<--- snip ---8<---
>>>> I wonder if the following would work:
>>>>
>>>>
>>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>>> index 65db9349f9053..56b82e116cb79 100644
>>>> --- a/include/linux/gfp_types.h
>>>> +++ b/include/linux/gfp_types.h
>>>> @@ -47,7 +47,9 @@ enum {
>>>>  	___GFP_HARDWALL_BIT,
>>>>  	___GFP_THISNODE_BIT,
>>>>  	___GFP_ACCOUNT_BIT,
>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>  	___GFP_ZEROTAGS_BIT,
>>>> +#endif
>>>>  #ifdef CONFIG_KASAN_HW_TAGS
>>>>  	___GFP_SKIP_ZERO_BIT,
>>>>  	___GFP_SKIP_KASAN_BIT,
>>>> @@ -85,7 +87,11 @@ enum {
>>>>  #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
>>>>  #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
>>>>  #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>  #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
>>>> +#else
>>>> +#define ___GFP_ZEROTAGS 0
>>>> +#endif
>>>>  #ifdef CONFIG_KASAN_HW_TAGS
>>>>  #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
>>>>  #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
>>>>
>>>>
>>>> Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
>>>> kconfig option.
>>>>
>>>>
>>>> Then we could turn the default implementation of
>>>> tag_clear_highpage() into a BUILD_BUG.
>>>>
>>> I'd like to suggest to keep the enum untouched and only use the second
>>> part of your suggestion.
>>
>> Why? We also do that for CONFIG_KASAN_HW_TAGS, CONFIG_LOCKDEP and
>> CONFIG_SLAB_OBJ_EXT.
> If we remove the enum entry, we’d also need to update mmflags.h because
> the trace macros reference it.
> Enums are compile-time only, so they don’t affect the generated binary.
> My thought was to keep the enum list as it is and just apply the second
> part of your suggestion.
> That way, the trace definitions stay consistent without extra changes.
> Just an idea, happy to go with whatever you prefer.

I think we'd remove the enum value as well, because then there is no way
it could accidentally be reused.

And yes, as you correctly state we'll have to update mmflags as well
like we did for CONFIG_KASAN_HW_TAGS etc.

-- 
Cheers

David

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
2025-11-11 12:27 ` David Hildenbrand (Red Hat)
@ 2025-11-11 12:28 ` David Hildenbrand (Red Hat)
0 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-11 12:28 UTC (permalink / raw)
To: Jan Polensky, catalin.marinas; +Cc: akpm, linux-arm-kernel, linux-mm, will

On 11.11.25 13:27, David Hildenbrand (Red Hat) wrote:
> On 11.11.25 11:44, Jan Polensky wrote:
>> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
>>> On 10.11.25 10:48, Jan Polensky wrote:
>>>> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>>>>> On 09.11.25 01:36, Jan Polensky wrote:
>> ---8<--- snip ---8<---
>>>>> I wonder if the following would work:
>>>>>
>>>>>
>>>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>>>> index 65db9349f9053..56b82e116cb79 100644
>>>>> --- a/include/linux/gfp_types.h
>>>>> +++ b/include/linux/gfp_types.h
>>>>> @@ -47,7 +47,9 @@ enum {
>>>>>  	___GFP_HARDWALL_BIT,
>>>>>  	___GFP_THISNODE_BIT,
>>>>>  	___GFP_ACCOUNT_BIT,
>>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>>  	___GFP_ZEROTAGS_BIT,
>>>>> +#endif
>>>>>  #ifdef CONFIG_KASAN_HW_TAGS
>>>>>  	___GFP_SKIP_ZERO_BIT,
>>>>>  	___GFP_SKIP_KASAN_BIT,
>>>>> @@ -85,7 +87,11 @@ enum {
>>>>>  #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
>>>>>  #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
>>>>>  #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
>>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>>  #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
>>>>> +#else
>>>>> +#define ___GFP_ZEROTAGS 0
>>>>> +#endif
>>>>>  #ifdef CONFIG_KASAN_HW_TAGS
>>>>>  #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
>>>>>  #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
>>>>>
>>>>>
>>>>> Likely we'd have to make __HAVE_ARCH_TAG_CLEAR_HIGHPAGE a proper
>>>>> kconfig option.
>>>>>
>>>>>
>>>>> Then we could turn the default implementation of
>>>>> tag_clear_highpage() into a BUILD_BUG.
>>>>>
>>>> I'd like to suggest to keep the enum untouched and only use the second
>>>> part of your suggestion.
>>>
>>> Why? We also do that for CONFIG_KASAN_HW_TAGS, CONFIG_LOCKDEP and
>>> CONFIG_SLAB_OBJ_EXT.
>> If we remove the enum entry, we’d also need to update mmflags.h because
>> the trace macros reference it.
>> Enums are compile-time only, so they don’t affect the generated binary.
>> My thought was to keep the enum list as it is and just apply the second
>> part of your suggestion.
>> That way, the trace definitions stay consistent without extra changes.
>> Just an idea, happy to go with whatever you prefer.
>
> I think we'd remove the enum value as well, because then there is no way
> it could accidentally be reused.
>
> And yes, as you correctly state we'll have to update mmflags as well
> like we did for CONFIG_KASAN_HW_TAGS etc.

/me realizing that my mail client decided to use yet another mail alias,
I hope I have it fixed now such that everything is sent from my
kernel.org account ...

-- 
Cheers

David

^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2025-11-11 12:28 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-31 16:57 [PATCH] mm/huge_memory: Initialise the tags of the huge zero folio Catalin Marinas
2025-10-31 17:15 ` David Hildenbrand
2025-11-03 13:32 ` Mark Brown
2025-11-03 14:30 ` Catalin Marinas
2025-11-03 14:41 ` David Hildenbrand (Red Hat)
2025-11-03 15:59 ` Catalin Marinas
2025-11-03 19:29 ` Beleswar Prasad Padhi
2025-11-04 1:05 ` Andrew Morton
2025-11-04 8:52 ` Catalin Marinas
2025-11-04 11:53 ` [PATCH] mm/huge_memory: Initialise the tags of the huge zero Lance Yang
2025-11-08 19:19 ` [PATCH] mm/huge_memory: initialise the tags of the huge zero folio Jan Polensky
2025-11-09 0:42 ` [PATCH] Clarification: please ignore earlier submission Jan Polensky
2025-11-09 0:36 ` [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures Jan Polensky
2025-11-10 9:09 ` David Hildenbrand (Red Hat)
2025-11-10 9:48 ` Jan Polensky
2025-11-10 9:53 ` David Hildenbrand (Red Hat)
2025-11-10 15:28 ` Catalin Marinas
2025-11-10 15:55 ` Catalin Marinas
2025-11-11 10:44 ` Jan Polensky
2025-11-11 12:27 ` David Hildenbrand (Red Hat)
2025-11-11 12:28 ` David Hildenbrand (Red Hat)