* [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area
@ 2026-06-06  3:49 Oscar Salvador
  2026-06-06  3:49 ` [PATCH 1/8] mm,hugetlb: Encode extra padding for aligning directly in hugetlb_get_unmapped_area Oscar Salvador
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador


A regression was reported on AMD 15h Family for hugetlb mappings when 'align_va_addr'
is set [1].
Historically, for hugetlb mappings we always ignored 'align_offset' in get_unmapped_area
functions, but after commit 7bd3f1e1a9ae ("mm: make hugetlb mappings go through
mm_get_unmapped_area_vmflags") that was no longer the case for x86.

While we could fix that by working around it in x86 code, the truth is that the way
hugetlb mappings currently interact with the get_unmapped_area functions is a bit
clumsy, and we can do better.
This patchset aims at two things:

1) Fix the regression reported in [1]
2) Stop special-casing hugetlb mappings in get_unmapped_area functions

Fixing the regression involves doing what we do for THP mappings: encoding
the extra padding we need for worst-case alignment directly in the 'len'
parameter, and no longer using 'align_mask' for that purpose.
Then we align the address we get back from mm_get_unmapped_area_vmflags to the
huge page size.
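
To make the padding arithmetic concrete, here is a small userspace sketch. It
is only an illustration, not kernel code: the 4 KiB base page, the 2 MiB huge
page, and the helpers align_up(), padded_len() and aligned_mapping_fits() are
all assumptions made up for this example.

```c
#include <stdint.h>

#define PAGE_SIZE   UINT64_C(4096)       /* assumption: 4 KiB base pages */
#define HPAGE_SIZE  (UINT64_C(2) << 20)  /* assumption: 2 MiB huge pages */

/* Round x up to the next multiple of the power-of-two a. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Length handed to the gap search: requested length plus worst-case padding. */
static uint64_t padded_len(uint64_t len)
{
	return len + HPAGE_SIZE - PAGE_SIZE;
}

/*
 * gap_start is a page-aligned address that has padded_len(len) free bytes
 * after it. Returns 1 if the mapping still fits in that gap once its start
 * is rounded up to a huge page boundary.
 */
static int aligned_mapping_fits(uint64_t gap_start, uint64_t len)
{
	uint64_t addr = align_up(gap_start, HPAGE_SIZE);

	return addr + len <= gap_start + padded_len(len);
}
```

The worst case is a gap that starts one small page past a huge page boundary:
rounding up then consumes exactly HPAGE_SIZE - PAGE_SIZE bytes, which is the
padding that was added to the length.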

After that, since the setup of 'align_mask' for hugetlb mappings disappears, we
do not have to be hugetlb-aware down the road, and we can stop special-casing
it.

The only thing to note is that, depending on the architecture, we sometimes
use 'align_mask' to do coloring for the pagecache and to avoid cache aliasing.
This was ignored for hugetlb mappings, but after this patchset that is no longer the
case: 'align_mask' will be added on top of the requested length of the mapping,
making us allocate a larger(ish) gap in the maple tree. That should be harmless,
as most of the time that is PAGE_SIZE, and whatever the offset is will be masked
off in hugetlb_get_unmapped_area.
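
As a rough model of why the coloring is harmless, the sketch below mimics, in
userspace, how a bottom-up gap search grows the length by 'align_mask' and
shifts the gap start to honour 'align_offset'. It is a simplification based on
my reading of unmapped_area(): it ignores start_gap and the top-down variant,
and the helper names plus the 4 KiB / 2 MiB sizes are assumptions for
illustration only.

```c
#include <stdint.h>

#define PAGE_SIZE   UINT64_C(4096)       /* assumption: 4 KiB base pages */
#define HPAGE_SIZE  (UINT64_C(2) << 20)  /* assumption: 2 MiB huge pages */

/* Round x up to the next multiple of the power-of-two a. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Length the gap search looks for: the requested length plus the mask. */
static uint64_t search_length(uint64_t len, uint64_t align_mask)
{
	return len + align_mask;
}

/* Shift the gap start so that (addr - align_offset) & align_mask == 0. */
static uint64_t color_gap(uint64_t gap, uint64_t align_offset,
			  uint64_t align_mask)
{
	return gap + ((align_offset - gap) & align_mask);
}
```

With a mask of 0x7000 and an offset of 0x3000 (values picked only for the
example), a gap at 0x10000000 is shifted to 0x10003000. Rounding that up to
the huge page size gives 0x10200000, so the colored offset is discarded, and
the mapping still fits because the search length already carried both the mask
and the hugetlb padding.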

[NOTE]
Patch #1 is v2, since Dave Hansen was kind enough to provide some feedback
in [2] while this was still in the works.

[1] https://lore.kernel.org/linux-mm/20260527143643.GO31091@soohrt.org/
[2] https://lore.kernel.org/linux-mm/bae37c0c-f9e8-486b-a1a8-53c9dc0af8e0@intel.com/#t

Oscar Salvador (8):
  mm,hugetlb: Encode extra padding for aligning directly in
    hugetlb_get_unmapped_area
  mm/mmap: Stop special-casing hugetlb in generic_get_unmapped_area path
  arch/x86: Stop special-casing hugetlb mappings in
    arch_get_unmapped_area
  arch/s390: Stop special-casing hugetlb mappings in
    arch_get_unmapped_area
  arch/loongarch: Stop special-casing hugetlb in
    arch_get_unmapped_area_common
  arch/sparc64: Stop special-casing hugetlb mappings in
    arch_get_unmapped_area
  arch/sparc32: Rip out hugetlb from sys_sparc_32.c
  mm,hugetlb: Kill huge_page_mask_align

 arch/loongarch/mm/mmap.c         |  6 +-----
 arch/s390/mm/mmap.c              |  9 ++-------
 arch/sparc/kernel/sys_sparc_32.c | 16 +++-------------
 arch/sparc/kernel/sys_sparc_64.c | 25 ++++++-------------------
 arch/x86/kernel/sys_x86_64.c     | 15 ++++-----------
 fs/hugetlbfs/inode.c             | 25 ++++++++++++++++++++++++-
 include/linux/hugetlb.h          | 10 ----------
 mm/mmap.c                        |  4 ----
 8 files changed, 40 insertions(+), 70 deletions(-)

-- 
2.35.3



* [PATCH 1/8] mm,hugetlb: Encode extra padding for aligning directly in hugetlb_get_unmapped_area
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
@ 2026-06-06  3:49 ` Oscar Salvador
  2026-06-06  3:49 ` [PATCH 2/8] mm/mmap: Stop special-casing hugetlb in generic_get_unmapped_area path Oscar Salvador
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador

Currently, for hugetlb mappings, arches query the align_mask needed to find a
gap big enough for the mapping. The mask accounts for the worst-case scenario,
which is being one page off from the boundary of the next hugetlb-aligned address.

Later on, unmapped_area{_topdown} adds this 'align_mask' on top of the
mapping's length we requested, to account for the worst-case scenario and get a
gap big enough, and then does the proper masking off once we get the address.

This means we have to special case hugetlb in quite a few places to first
1) ignore align_offset and 2) return a proper align_mask, but we can just do
as THP mappings do and encode the align_mask directly in hugetlb_get_unmapped_area
and pass that down to mm_get_unmapped_area_vmflags.
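
For power-of-two huge page sizes the old mask and the new padding are the same
number, which the following userspace sketch tries to show. The helpers here
are hypothetical stand-ins written for this example (hpage_mask() plays the
role of huge_page_mask()), and the 4 KiB PAGE_SIZE is an assumption.

```c
#include <stdint.h>

#define PAGE_SIZE  UINT64_C(4096)        /* assumption: 4 KiB base pages */
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Userspace stand-in for huge_page_mask(): ~(huge page size - 1). */
static uint64_t hpage_mask(uint64_t hpage_size)
{
	return ~(hpage_size - 1);
}

/* The value huge_page_mask_align() computed before this patch. */
static uint64_t old_align_mask(uint64_t hpage_size)
{
	return PAGE_MASK & ~hpage_mask(hpage_size);
}

/* The padding this patch adds to 'len' instead. */
static uint64_t new_len_padding(uint64_t hpage_size)
{
	return hpage_size - PAGE_SIZE;
}
```

For a 2 MiB huge page both helpers yield 0x1ff000, so the size of the gap we
ask for does not change; only the place where the padding is added does.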

Once we get the address, we can properly mask it off before handing it over to
userspace.

This also allows us to fix a regression on the AMD 15h family for hugetlb
mappings [1].

[1] https://lore.kernel.org/linux-mm/20260527143643.GO31091@soohrt.org/

Fixes: 7bd3f1e1a9ae ("mm: make hugetlb mappings go through mm_get_unmapped_area_vmflags")
Reported-by: Karsten Desler <kdesler@soohrt.org>
Closes: https://lore.kernel.org/linux-mm/20260527143643.GO31091@soohrt.org/
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
---
 fs/hugetlbfs/inode.c    | 25 ++++++++++++++++++++++++-
 include/linux/hugetlb.h |  2 +-
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 78d61bf2bd9b..da9bfb5b79f7 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -184,7 +184,30 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	if (addr)
 		addr0 = ALIGN(addr, huge_page_size(h));
 
-	return mm_get_unmapped_area_vmflags(file, addr0, len, pgoff, flags, 0);
+	/*
+	 * hugetlb mappings need to be huge page aligned.
+	 * mm_get_unmapped_area_vmflags() only gives back PAGE_SIZE aligned
+	 * areas.
+	 *
+	 * Extend the requested unmapped area for the worst case scenario:
+	 * the address comes back and needs to be aligned up by one small
+	 * page short of a large page.
+	 */
+	len += huge_page_size(h) - PAGE_SIZE;
+	/*
+	 * pgoff is being used for coloring, but hugetlb mappings need strict
+	 * alignment, so it is always ignored by special casing it down the road.
+	 * Set it to 0 here, so we do not need to be hugetlb-aware later on.
+	 */
+	pgoff = 0;
+	addr0 = mm_get_unmapped_area_vmflags(file, addr0, len, pgoff, flags, 0);
+	if (IS_ERR_VALUE(addr0))
+		return addr0;
+
+	/* Align the address to the next huge_page_size boundary */
+	addr0 = ALIGN(addr0, huge_page_size(h));
+
+	return addr0;
 }
 
 /*
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5957bc25efa8..924e9a464214 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1085,7 +1085,7 @@ bool is_raw_hwpoison_page_in_hugepage(struct page *page);
 
 static inline unsigned long huge_page_mask_align(struct file *file)
 {
-	return PAGE_MASK & ~huge_page_mask(hstate_file(file));
+	return 0;
 }
 
 #else	/* CONFIG_HUGETLB_PAGE */
-- 
2.35.3



* [PATCH 2/8] mm/mmap: Stop special-casing hugetlb in generic_get_unmapped_area path
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
  2026-06-06  3:49 ` [PATCH 1/8] mm,hugetlb: Encode extra padding for aligning directly in hugetlb_get_unmapped_area Oscar Salvador
@ 2026-06-06  3:49 ` Oscar Salvador
  2026-06-06  3:49 ` [PATCH 3/8] arch/x86: Stop special-casing hugetlb mappings in arch_get_unmapped_area Oscar Salvador
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador

generic_get_unmapped_area* sets info.align_mask to make room for extra alignment,
so that is added on top of the length we requested in unmapped_area{_topdown}.
hugetlb_get_unmapped_area() already adds this extra padding in the 'len'
parameter, and it also masks off the address it gets to properly align it to
the huge_page_size we are using.

Stop special-casing hugetlb in generic_get_unmapped_area* functions.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 mm/mmap.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 5754d1c36462..e4defe9fb1b7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -715,8 +715,6 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr,
 	info.low_limit = mm->mmap_base;
 	info.high_limit = mmap_end;
 	info.start_gap = stack_guard_placement(vm_flags);
-	if (filp && is_file_hugepages(filp))
-		info.align_mask = huge_page_mask_align(filp);
 	return vm_unmapped_area(&info);
 }
 
@@ -767,8 +765,6 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = arch_get_mmap_base(addr, mm->mmap_base);
 	info.start_gap = stack_guard_placement(vm_flags);
-	if (filp && is_file_hugepages(filp))
-		info.align_mask = huge_page_mask_align(filp);
 	addr = vm_unmapped_area(&info);
 
 	/*
-- 
2.35.3



* [PATCH 3/8] arch/x86: Stop special-casing hugetlb mappings in arch_get_unmapped_area
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
  2026-06-06  3:49 ` [PATCH 1/8] mm,hugetlb: Encode extra padding for aligning directly in hugetlb_get_unmapped_area Oscar Salvador
  2026-06-06  3:49 ` [PATCH 2/8] mm/mmap: Stop special-casing hugetlb in generic_get_unmapped_area path Oscar Salvador
@ 2026-06-06  3:49 ` Oscar Salvador
  2026-06-06  3:49 ` [PATCH 4/8] arch/s390: " Oscar Salvador
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador

arch_get_unmapped_area* sets info.align_mask to make room for extra alignment,
so that is added on top of the length we request in unmapped_area{_topdown}.
hugetlb_get_unmapped_area() already adds this extra padding in the 'len'
parameter, and it also masks off the address it gets to properly align it to
the huge_page_size we are using.

So, stop special-casing hugetlb in arch_get_unmapped_area* functions.

Also, there is no need to worry about align_offset and start_gap because
the former will be masked off back in hugetlb_get_unmapped_area() and
the latter only applies to mappings created via vm_mmap_shadow_stack() which
do not involve hugetlb.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 arch/x86/kernel/sys_x86_64.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 776ae6fa7f2d..1cf2dafd223b 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -18,7 +18,6 @@
 #include <linux/random.h>
 #include <linux/uaccess.h>
 #include <linux/elf.h>
-#include <linux/hugetlb.h>
 
 #include <asm/elf.h>
 #include <asm/ia32.h>
@@ -28,8 +27,6 @@
  */
 static unsigned long get_align_mask(struct file *filp)
 {
-	if (filp && is_file_hugepages(filp))
-		return huge_page_mask_align(filp);
 	/* handle 32- and 64-bit case with a single conditional */
 	if (va_align.flags < 0 || !(va_align.flags & (2 - mmap_is_ia32())))
 		return 0;
@@ -151,10 +148,8 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len,
 	info.length = len;
 	info.low_limit = begin;
 	info.high_limit = end;
-	if (!(filp && is_file_hugepages(filp))) {
-		info.align_offset = pgoff << PAGE_SHIFT;
-		info.start_gap = stack_guard_placement(vm_flags);
-	}
+	info.align_offset = pgoff << PAGE_SHIFT;
+	info.start_gap = stack_guard_placement(vm_flags);
 	if (filp) {
 		info.align_mask = get_align_mask(filp);
 		info.align_offset += get_align_bits();
@@ -205,10 +200,8 @@ arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr0,
 		info.low_limit = PAGE_SIZE;
 
 	info.high_limit = get_mmap_base(0);
-	if (!(filp && is_file_hugepages(filp))) {
-		info.start_gap = stack_guard_placement(vm_flags);
-		info.align_offset = pgoff << PAGE_SHIFT;
-	}
+	info.start_gap = stack_guard_placement(vm_flags);
+	info.align_offset = pgoff << PAGE_SHIFT;
 
 	/*
 	 * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
-- 
2.35.3



* [PATCH 4/8] arch/s390: Stop special-casing hugetlb mappings in arch_get_unmapped_area
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
                   ` (2 preceding siblings ...)
  2026-06-06  3:49 ` [PATCH 3/8] arch/x86: Stop special-casing hugetlb mappings in arch_get_unmapped_area Oscar Salvador
@ 2026-06-06  3:49 ` Oscar Salvador
  2026-06-30 17:10   ` Gerald Schaefer
  2026-06-06  3:50 ` [PATCH 5/8] arch/loongarch: Stop special-casing hugetlb in arch_get_unmapped_area_common Oscar Salvador
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador

arch_get_unmapped_area* sets info.align_mask to make room for extra alignment,
so that is added on top of the length we request in unmapped_area{_topdown}.
hugetlb_get_unmapped_area() already adds this extra padding in the 'len'
parameter, and it also masks off the address it gets to properly align it to
the huge_page_size we are using.

So, stop special-casing hugetlb in arch_get_unmapped_area* functions.

Also, there is no need to worry about align_offset because that will be
masked off back in hugetlb_get_unmapped_area().

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 arch/s390/mm/mmap.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index 2a222a7e14f4..46c29738c5ef 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -16,7 +16,6 @@
 #include <linux/sched/mm.h>
 #include <linux/random.h>
 #include <linux/security.h>
-#include <linux/hugetlb.h>
 #include <asm/elf.h>
 
 static unsigned long stack_maxrandom_size(void)
@@ -66,8 +65,6 @@ static inline unsigned long mmap_base(unsigned long rnd,
 
 static int get_align_mask(struct file *filp, unsigned long flags)
 {
-	if (filp && is_file_hugepages(filp))
-		return huge_page_mask_align(filp);
 	if (!(current->flags & PF_RANDOMIZE))
 		return 0;
 	if (filp || (flags & MAP_SHARED))
@@ -101,8 +98,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	info.low_limit = mm->mmap_base;
 	info.high_limit = TASK_SIZE;
 	info.align_mask = get_align_mask(filp, flags);
-	if (!(filp && is_file_hugepages(filp)))
-		info.align_offset = pgoff << PAGE_SHIFT;
+	info.align_offset = pgoff << PAGE_SHIFT;
 	addr = vm_unmapped_area(&info);
 	if (offset_in_page(addr))
 		return addr;
@@ -140,8 +136,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp, unsigned long ad
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = mm->mmap_base;
 	info.align_mask = get_align_mask(filp, flags);
-	if (!(filp && is_file_hugepages(filp)))
-		info.align_offset = pgoff << PAGE_SHIFT;
+	info.align_offset = pgoff << PAGE_SHIFT;
 	addr = vm_unmapped_area(&info);
 
 	/*
-- 
2.35.3



* [PATCH 5/8] arch/loongarch: Stop special-casing hugetlb in arch_get_unmapped_area_common
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
                   ` (3 preceding siblings ...)
  2026-06-06  3:49 ` [PATCH 4/8] arch/s390: " Oscar Salvador
@ 2026-06-06  3:50 ` Oscar Salvador
  2026-06-06  3:50 ` [PATCH 6/8] arch/sparc64: Stop special-casing hugetlb mappings in arch_get_unmapped_area Oscar Salvador
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador

arch_get_unmapped_area_common sets info.align_mask to make room for extra alignment,
so that is added on top of the length we request in unmapped_area{_topdown}.
hugetlb_get_unmapped_area() already adds this extra padding in the 'len'
parameter, and it masks off the address it gets to properly align it to
the huge_page_size we are using.

Stop special-casing hugetlb in arch_get_unmapped_area_common.
The only caveat is that we might get extra align_mask in case we do coloring,
but that should only make us allocate a larger(ish) gap in unmapped_area{_topdown}.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 arch/loongarch/mm/mmap.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/loongarch/mm/mmap.c b/arch/loongarch/mm/mmap.c
index 1df9e99582cc..8f0af181f902 100644
--- a/arch/loongarch/mm/mmap.c
+++ b/arch/loongarch/mm/mmap.c
@@ -3,7 +3,6 @@
  * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
  */
 #include <linux/export.h>
-#include <linux/hugetlb.h>
 #include <linux/io.h>
 #include <linux/kfence.h>
 #include <linux/memblock.h>
@@ -65,10 +64,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 
 	info.length = len;
 	info.align_offset = pgoff << PAGE_SHIFT;
-	if (filp && is_file_hugepages(filp))
-		info.align_mask = huge_page_mask_align(filp);
-	else
-		info.align_mask = do_color_align ? (PAGE_MASK & SHM_ALIGN_MASK) : 0;
+	info.align_mask = do_color_align ? (PAGE_MASK & SHM_ALIGN_MASK) : 0;
 
 	if (dir == DOWN) {
 		info.flags = VM_UNMAPPED_AREA_TOPDOWN;
-- 
2.35.3



* [PATCH 6/8] arch/sparc64: Stop special-casing hugetlb mappings in arch_get_unmapped_area
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
                   ` (4 preceding siblings ...)
  2026-06-06  3:50 ` [PATCH 5/8] arch/loongarch: Stop special-casing hugetlb in arch_get_unmapped_area_common Oscar Salvador
@ 2026-06-06  3:50 ` Oscar Salvador
  2026-06-06  3:50 ` [PATCH 7/8] arch/sparc32: Rip out hugetlb from sys_sparc_32.c Oscar Salvador
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador

arch_get_unmapped_area* sets info.align_mask to make room for extra alignment,
so that is added on top of the length we request in unmapped_area{_topdown}.
hugetlb_get_unmapped_area() already adds this extra padding in the 'len'
parameter, and it also masks off the address it gets to properly align it to
the huge_page_size we are using.

So, stop special-casing hugetlb in arch_get_unmapped_area* functions.

Also, there is no need to worry about align_offset because that will be
masked off back in hugetlb_get_unmapped_area().

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 arch/sparc/kernel/sys_sparc_64.c | 25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index ecefcffcf7b1..6823162befa0 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -30,7 +30,6 @@
 #include <linux/context_tracking.h>
 #include <linux/timex.h>
 #include <linux/uaccess.h>
-#include <linux/hugetlb.h>
 
 #include <asm/utrap.h>
 #include <asm/unistd.h>
@@ -90,8 +89,6 @@ static inline unsigned long COLOR_ALIGN(unsigned long addr,
 
 static unsigned long get_align_mask(struct file *filp, unsigned long flags)
 {
-	if (filp && is_file_hugepages(filp))
-		return huge_page_mask_align(filp);
 	if (filp || (flags & MAP_SHARED))
 		return PAGE_MASK & (SHMLBA - 1);
 
@@ -105,16 +102,12 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 	unsigned long task_size = TASK_SIZE;
 	int do_color_align;
 	struct vm_unmapped_area_info info = {};
-	bool file_hugepage = false;
-
-	if (filp && is_file_hugepages(filp))
-		file_hugepage = true;
 
 	if (flags & MAP_FIXED) {
 		/* We do not accept a shared mapping if it would violate
 		 * cache aliasing constraints.
 		 */
-		if (!file_hugepage && (flags & MAP_SHARED) &&
+		if ((flags & MAP_SHARED) &&
 		    ((addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1)))
 			return -EINVAL;
 		return addr;
@@ -126,7 +119,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 		return -ENOMEM;
 
 	do_color_align = 0;
-	if ((filp || (flags & MAP_SHARED)) && !file_hugepage)
+	if ((filp || (flags & MAP_SHARED)))
 		do_color_align = 1;
 
 	if (addr) {
@@ -145,8 +138,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 	info.low_limit = TASK_UNMAPPED_BASE;
 	info.high_limit = min(task_size, VA_EXCLUDE_START);
 	info.align_mask = get_align_mask(filp, flags);
-	if (!file_hugepage)
-		info.align_offset = pgoff << PAGE_SHIFT;
+	info.align_offset = pgoff << PAGE_SHIFT;
 	addr = vm_unmapped_area(&info);
 
 	if ((addr & ~PAGE_MASK) && task_size > VA_EXCLUDE_END) {
@@ -170,19 +162,15 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	unsigned long addr = addr0;
 	int do_color_align;
 	struct vm_unmapped_area_info info = {};
-	bool file_hugepage = false;
 
 	/* This should only ever run for 32-bit processes.  */
 	BUG_ON(!test_thread_flag(TIF_32BIT));
 
-	if (filp && is_file_hugepages(filp))
-		file_hugepage = true;
-
 	if (flags & MAP_FIXED) {
 		/* We do not accept a shared mapping if it would violate
 		 * cache aliasing constraints.
 		 */
-		if (!file_hugepage && (flags & MAP_SHARED) &&
+		if ((flags & MAP_SHARED) &&
 		    ((addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1)))
 			return -EINVAL;
 		return addr;
@@ -192,7 +180,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 		return -ENOMEM;
 
 	do_color_align = 0;
-	if ((filp || (flags & MAP_SHARED)) && !file_hugepage)
+	if ((filp || (flags & MAP_SHARED)))
 		do_color_align = 1;
 
 	/* requesting a specific address */
@@ -213,8 +201,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = mm->mmap_base;
 	info.align_mask = get_align_mask(filp, flags);
-	if (!file_hugepage)
-		info.align_offset = pgoff << PAGE_SHIFT;
+	info.align_offset = pgoff << PAGE_SHIFT;
 	addr = vm_unmapped_area(&info);
 
 	/*
-- 
2.35.3



* [PATCH 7/8] arch/sparc32: Rip out hugetlb from sys_sparc_32.c
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
                   ` (5 preceding siblings ...)
  2026-06-06  3:50 ` [PATCH 6/8] arch/sparc64: Stop special-casing hugetlb mappings in arch_get_unmapped_area Oscar Salvador
@ 2026-06-06  3:50 ` Oscar Salvador
  2026-06-06  3:50 ` [PATCH 8/8] mm,hugetlb: Kill huge_page_mask_align Oscar Salvador
  2026-06-28  6:55 ` [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Andrew Morton
  8 siblings, 0 replies; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador

ARCH_SUPPORTS_HUGETLBFS is only selected for sparc64, so rip out the hugetlb
special-casing from sys_sparc_32.c.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 arch/sparc/kernel/sys_sparc_32.c | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/arch/sparc/kernel/sys_sparc_32.c b/arch/sparc/kernel/sys_sparc_32.c
index fb31bc0c5b48..960f719b2257 100644
--- a/arch/sparc/kernel/sys_sparc_32.c
+++ b/arch/sparc/kernel/sys_sparc_32.c
@@ -23,7 +23,6 @@
 #include <linux/utsname.h>
 #include <linux/smp.h>
 #include <linux/ipc.h>
-#include <linux/hugetlb.h>
 
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
@@ -43,16 +42,12 @@ SYSCALL_DEFINE0(getpagesize)
 unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
 {
 	struct vm_unmapped_area_info info = {};
-	bool file_hugepage = false;
-
-	if (filp && is_file_hugepages(filp))
-		file_hugepage = true;
 
 	if (flags & MAP_FIXED) {
 		/* We do not accept a shared mapping if it would violate
 		 * cache aliasing constraints.
 		 */
-		if (!file_hugepage && (flags & MAP_SHARED) &&
+		if ((flags & MAP_SHARED) &&
 		    ((addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1)))
 			return -EINVAL;
 		return addr;
@@ -67,13 +62,8 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 	info.length = len;
 	info.low_limit = addr;
 	info.high_limit = TASK_SIZE;
-	if (!file_hugepage) {
-		info.align_mask = (flags & MAP_SHARED) ?
-			(PAGE_MASK & (SHMLBA - 1)) : 0;
-		info.align_offset = pgoff << PAGE_SHIFT;
-	} else {
-		info.align_mask = huge_page_mask_align(filp);
-	}
+	info.align_mask = (flags & MAP_SHARED) ? (PAGE_MASK & (SHMLBA - 1)) : 0;
+	info.align_offset = pgoff << PAGE_SHIFT;
 	return vm_unmapped_area(&info);
 }
 
-- 
2.35.3



* [PATCH 8/8] mm,hugetlb: Kill huge_page_mask_align
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
                   ` (6 preceding siblings ...)
  2026-06-06  3:50 ` [PATCH 7/8] arch/sparc32: Rip out hugetlb from sys_sparc_32.c Oscar Salvador
@ 2026-06-06  3:50 ` Oscar Salvador
  2026-06-28  6:55 ` [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Andrew Morton
  8 siblings, 0 replies; 12+ messages in thread
From: Oscar Salvador @ 2026-06-06  3:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm, Oscar Salvador

There are no more users of huge_page_mask_align, so kill it.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 include/linux/hugetlb.h | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 924e9a464214..24e5079aa767 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1083,19 +1083,9 @@ void hugetlb_unregister_node(struct node *node);
  */
 bool is_raw_hwpoison_page_in_hugepage(struct page *page);
 
-static inline unsigned long huge_page_mask_align(struct file *file)
-{
-	return 0;
-}
-
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
-static inline unsigned long huge_page_mask_align(struct file *file)
-{
-	return 0;
-}
-
 static inline struct hugepage_subpool *hugetlb_folio_subpool(struct folio *folio)
 {
 	return NULL;
-- 
2.35.3



* Re: [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area
  2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
                   ` (7 preceding siblings ...)
  2026-06-06  3:50 ` [PATCH 8/8] mm,hugetlb: Kill huge_page_mask_align Oscar Salvador
@ 2026-06-28  6:55 ` Andrew Morton
  2026-06-29 10:21   ` Lorenzo Stoakes
  8 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2026-06-28  6:55 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Dave Hansen, Karsten Desler, Muchun Song, David Hildenbrand,
	Lorenzo Stoakes, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm

On Sat,  6 Jun 2026 05:49:55 +0200 Oscar Salvador <osalvador@suse.de> wrote:

> A regression was reported on AMD 15h Family for hugetlb mappings when 'align_va_addr'
> is set [1].

It would be helpful to say right here that this regression results in a
runtime BUG().  Helps get attention ;)

> Historically, for hugetlb mappings we always ignored 'align_offset' in get_unmapped_area
> functions, but after commit 7bd3f1e1a9ae ("mm: make hugetlb mappings go through
> mm_get_unmapped_area_vmflags") that was no longer the case for x86.
> 
> While we could fix that by work it around in x86 code, the truth is that the current
> functioning of hugetlb mappings with get_unmapped_area functions is a bit clumsy, and
> we can do better.
> This patchset aims at two things:
> 
> 1) Fix regression reported in [1]
> 2) Stop special-casing hugetlb mappings in get_unmapped_area functions

OK, so the offending commit was about 1.5 years ago.

Do we want to fix -stable kernels?  If so, can we start out with
something minimal for backporting?  And narrow down its Fixes:?

The series was sent at an awkward time in the -rc cycle, which perhaps
explains the lack of feedback.  I suggest a refresh/retest/resend to
help bring people up to speed.

Sashiko has quite a lot to say - I hope some of it is useful?
	https://sashiko.dev/#/patchset/20260606035003.529685-1-osalvador@suse.de



* Re: [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area
  2026-06-28  6:55 ` [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Andrew Morton
@ 2026-06-29 10:21   ` Lorenzo Stoakes
  0 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes @ 2026-06-29 10:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Dave Hansen, Karsten Desler, Muchun Song,
	David Hildenbrand, Vlastimil Babka, Liam R . Howlett,
	Andreas Larsson, David S . Miller, Huacai Chen, Alexander Gordeev,
	Gerald Schaefer, linux-kernel, linux-mm

On Sat, Jun 27, 2026 at 11:55:12PM -0700, Andrew Morton wrote:
> On Sat,  6 Jun 2026 05:49:55 +0200 Oscar Salvador <osalvador@suse.de> wrote:
>
> > A regression was reported on AMD 15h Family for hugetlb mappings when 'align_va_addr'
> > is set [1].
>
> It would be helpful to say right here that this regression results in a
> runtime BUG().  Helps get attention ;)
>
> > Historically, for hugetlb mappings we always ignored 'align_offset' in get_unmapped_area
> > functions, but after commit 7bd3f1e1a9ae ("mm: make hugetlb mappings go through
> > mm_get_unmapped_area_vmflags") that was no longer the case for x86.
> >
> > While we could fix that by working around it in x86 code, the truth is that the current
> > functioning of hugetlb mappings with get_unmapped_area functions is a bit clumsy, and
> > we can do better.
> > This patchset aims at two things:
> >
> > 1) Fix regression reported in [1]
> > 2) Stop special-casing hugetlb mappings in get_unmapped_area functions
>
> OK, so the offending commit was about 1.5 years ago.
>
> Do we want to fix -stable kernels?  If so, can we start out with
> something minimal for backporting?  And narrow down its Fixes:?
>
> The series was sent at an awkward time in the -rc cycle, which perhaps
> explains the lack of feedback.  I suggest a refresh/retest/resend to
> help bring people up to speed.

Yes please rebase + resend :)

>
> Sashiko has quite a lot to say - I hope some of it is useful?
> 	https://sashiko.dev/#/patchset/20260606035003.529685-1-osalvador@suse.de
>

Cheers, Lorenzo

* Re: [PATCH 4/8] arch/s390: Stop special-casing hugetlb mappings in arch_get_unmapped_area
  2026-06-06  3:49 ` [PATCH 4/8] arch/s390: " Oscar Salvador
@ 2026-06-30 17:10   ` Gerald Schaefer
  0 siblings, 0 replies; 12+ messages in thread
From: Gerald Schaefer @ 2026-06-30 17:10 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Andrew Morton, Dave Hansen, Karsten Desler, Muchun Song,
	David Hildenbrand, Lorenzo Stoakes, Vlastimil Babka,
	Liam R . Howlett, Andreas Larsson, David S . Miller, Huacai Chen,
	Alexander Gordeev, linux-kernel, linux-mm, linux-s390

On Sat,  6 Jun 2026 05:49:59 +0200
Oscar Salvador <osalvador@suse.de> wrote:

> arch_get_unmapped_area* sets info.align_mask to make room for extra alignment,
> so that is added on top of the length we request in unmapped_area{_topdown}.
> hugetlb_get_unmapped_area() already adds this extra padding in the 'len'
> parameter, and it also masks off the address it gets to properly align it to
> the huge_page_size we are using.
> 
> So, stop special-casing hugetlb in arch_get_unmapped_area* functions.
> 
> Also, there is no need to worry about align_offset because that will be
> masked off back in hugetlb_get_unmapped_area().
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
>  arch/s390/mm/mmap.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)

Regarding the Critical finding for s390 in the Sashiko review at
https://sashiko.dev/#/patchset/20260606035003.529685-1-osalvador@suse.de

Yes, I think crst_table_upgrade() could be skipped "If the original length
fits right below TASK_SIZE, but the inflated length pushes addr + len over
TASK_SIZE".

But subsequent page faults should then generate an ASCE-type exception,
killing the user space program, and not alias with lower virtual addresses
causing memory corruption.

Still, I wonder if we want an extra check for "addr + (inflated) len > TASK_SIZE"
in check_asce_limit(), or somewhere else.

This "inflated length" approach also seems to have other subtle impacts on
other archs, according to Sashiko, possibly resulting in failed mappings for
valid addresses and ranges. So some extra checking or retry logic might be
needed anyway.
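
To illustrate the boundary case discussed above, here is a minimal userspace
sketch in C. It is not kernel code and not taken from the patches:
HPAGE_SIZE, inflate_len(), align_up_hpage() and fits_below() are hypothetical
names, and padding the request by one full huge page is an assumption about
how the extra alignment room is encoded in 'len'. It only shows why a limit
check can give different answers for the original and the inflated length.

```c
#include <stdint.h>

/* Hypothetical 2 MiB huge page size, chosen only for illustration. */
#define HPAGE_SIZE (2ULL << 20)
#define HPAGE_MASK (~(HPAGE_SIZE - 1))

/*
 * Assumed padding scheme: ask for one extra huge page so that an aligned
 * range of 'len' bytes is guaranteed to fit inside whatever gap is found.
 */
static uint64_t inflate_len(uint64_t len)
{
	return len + HPAGE_SIZE;
}

/* Round an address up to the next huge page boundary. */
static uint64_t align_up_hpage(uint64_t addr)
{
	return (addr + HPAGE_SIZE - 1) & HPAGE_MASK;
}

/*
 * Overflow-safe test that [addr, addr + len) ends at or below 'limit'.
 * 'limit' stands in for an address space limit such as TASK_SIZE.
 */
static int fits_below(uint64_t addr, uint64_t len, uint64_t limit)
{
	return addr <= limit && len <= limit - addr;
}
```

With a hypothetical limit of 1ULL << 42 and a request of four huge pages that
ends exactly at that limit, fits_below(addr, len, limit) holds while
fits_below(addr, inflate_len(len), limit) does not. Any check that sees only
one of the two lengths can therefore disagree with a check that sees the
other, which is the kind of mismatch an extra check would have to cover.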

Thread overview: 12+ messages
2026-06-06  3:49 [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Oscar Salvador
2026-06-06  3:49 ` [PATCH 1/8] mm,hugetlb: Encode extra padding for aligning directly in hugetlb_get_unmapped_area Oscar Salvador
2026-06-06  3:49 ` [PATCH 2/8] mm/mmap: Stop special-casing hugetlb in generic_get_unmapped_area path Oscar Salvador
2026-06-06  3:49 ` [PATCH 3/8] arch/x86: Stop special-casing hugetlb mappings in arch_get_unmapped_area Oscar Salvador
2026-06-06  3:49 ` [PATCH 4/8] arch/s390: " Oscar Salvador
2026-06-30 17:10   ` Gerald Schaefer
2026-06-06  3:50 ` [PATCH 5/8] arch/loongarch: Stop special-casing hugetlb in arch_get_unmapped_area_common Oscar Salvador
2026-06-06  3:50 ` [PATCH 6/8] arch/sparc64: Stop special-casing hugetlb mappings in arch_get_unmapped_area Oscar Salvador
2026-06-06  3:50 ` [PATCH 7/8] arch/sparc32: Rip out hugetlb from sys_sparc_32.c Oscar Salvador
2026-06-06  3:50 ` [PATCH 8/8] mm,hugetlb: Kill huge_page_mask_align Oscar Salvador
2026-06-28  6:55 ` [PATCH 0/8] Stop special-casing hugetlb mappings in get_unmapped_area Andrew Morton
2026-06-29 10:21   ` Lorenzo Stoakes
