public inbox for linux-s390@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag
       [not found] <20240326021656.202649-1-rick.p.edgecombe@intel.com>
@ 2024-03-26  2:16 ` Rick Edgecombe
  2024-03-26  3:32   ` Alexei Starovoitov
                     ` (2 more replies)
  2024-03-26  2:16 ` [PATCH v4 10/14] treewide: Use initializer for struct vm_unmapped_area_info Rick Edgecombe
  1 sibling, 3 replies; 9+ messages in thread
From: Rick Edgecombe @ 2024-03-26  2:16 UTC (permalink / raw)
  To: Liam.Howlett, akpm, bp, broonie, christophe.leroy, dave.hansen,
	debug, hpa, keescook, kirill.shutemov, luto, mingo, peterz, tglx,
	x86
  Cc: rick.p.edgecombe, linux-kernel, linux-mm, linux-s390, sparclinux,
	linux-sgx, nvdimm, linux-cxl, linux-fsdevel, io-uring, bpf

The mm_struct contains a function pointer *get_unmapped_area(), which
is set to either arch_get_unmapped_area() or
arch_get_unmapped_area_topdown() during the initialization of the mm.

Since the function pointer only ever points to two functions that are named
the same across all arch's, a function pointer is not really required. In
addition future changes will want to add versions of the functions that
take additional arguments. So to save a pointers worth of bytes in
mm_struct, and prevent adding additional function pointers to mm_struct in
future changes, remove it and keep the information about which
get_unmapped_area() to use in a flag.

Add the new flag to MMF_INIT_MASK so it doesn't get clobbered on fork by
mmf_init_flags(). Most MM flags get clobbered on fork. In the pre-existing
behavior mm->get_unmapped_area() would get copied to the new mm in
dup_mm(), so not clobbering the flag preserves the existing behavior
around inheriting the topdown-ness.

Introduce a helper, mm_get_unmapped_area(), to easily convert code that
refers to the old function pointer to instead select and call either
arch_get_unmapped_area() or arch_get_unmapped_area_topdown() based on the
flag. Then drop the mm->get_unmapped_area() function pointer. Leave the
get_unmapped_area() pointer in struct file_operations alone. The main
purpose of this change is to reorganize in preparation for future changes,
but it also converts the calls of mm->get_unmapped_area() from indirect
branches into a direct ones.

The stress-ng bigheap benchmark calls realloc a lot, which calls through
get_unmapped_area() in the kernel. On x86, the change yielded a ~1%
improvement there on a retpoline config.

In testing a few x86 configs, removing the pointer unfortunately didn't
result in any actual size reductions in the compiled layout of mm_struct.
But depending on compiler or arch alignment requirements, the change could
shrink the size of mm_struct.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: linux-s390@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-sgx@vger.kernel.org
Cc: nvdimm@lists.linux.dev
Cc: linux-cxl@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-fsdevel@vger.kernel.org
Cc: io-uring@vger.kernel.org
Cc: bpf@vger.kernel.org
---
v4:
 - Split out pde_get_unmapped_area() refactor into separate patch
   (Christophe Leroy)

v3:
 - Fix comment that still referred to mm->get_unmapped_area()
 - Resolve trivial rebase conflicts with "mm: thp_get_unmapped_area must
   honour topdown preference"
 - Spelling fix in log

v2:
 - Fix comment on MMF_TOPDOWN (Kirill, rppt)
 - Move MMF_TOPDOWN to actually unused bit
 - Add MMF_TOPDOWN to MMF_INIT_MASK so it doesn't get clobbered on fork,
   and result in the children using the search up path.
 - New lower performance results after above bug fix
 - Add Reviews and Acks
---
 arch/s390/mm/hugetlbpage.c       |  2 +-
 arch/s390/mm/mmap.c              |  4 ++--
 arch/sparc/kernel/sys_sparc_64.c | 15 ++++++---------
 arch/sparc/mm/hugetlbpage.c      |  2 +-
 arch/x86/kernel/cpu/sgx/driver.c |  2 +-
 arch/x86/mm/hugetlbpage.c        |  2 +-
 arch/x86/mm/mmap.c               |  4 ++--
 drivers/char/mem.c               |  2 +-
 drivers/dax/device.c             |  6 +++---
 fs/hugetlbfs/inode.c             |  4 ++--
 fs/proc/inode.c                  |  3 ++-
 fs/ramfs/file-mmu.c              |  2 +-
 include/linux/mm_types.h         |  6 +-----
 include/linux/sched/coredump.h   |  5 ++++-
 include/linux/sched/mm.h         |  5 +++++
 io_uring/io_uring.c              |  2 +-
 kernel/bpf/arena.c               |  2 +-
 kernel/bpf/syscall.c             |  2 +-
 mm/debug.c                       |  6 ------
 mm/huge_memory.c                 |  9 ++++-----
 mm/mmap.c                        | 21 ++++++++++++++++++---
 mm/shmem.c                       | 11 +++++------
 mm/util.c                        |  6 +++---
 23 files changed, 66 insertions(+), 57 deletions(-)

diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index c2e8242bd15d..219d906fe830 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -328,7 +328,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 			goto check_asce_limit;
 	}
 
-	if (mm->get_unmapped_area == arch_get_unmapped_area)
+	if (!test_bit(MMF_TOPDOWN, &mm->flags))
 		addr = hugetlb_get_unmapped_area_bottomup(file, addr, len,
 				pgoff, flags);
 	else
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index b14fc0887654..6b2e4436ad4a 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -185,10 +185,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 	 */
 	if (mmap_is_legacy(rlim_stack)) {
 		mm->mmap_base = mmap_base_legacy(random_factor);
-		mm->get_unmapped_area = arch_get_unmapped_area;
+		clear_bit(MMF_TOPDOWN, &mm->flags);
 	} else {
 		mm->mmap_base = mmap_base(random_factor, rlim_stack);
-		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		set_bit(MMF_TOPDOWN, &mm->flags);
 	}
 }
 
diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 1e9a9e016237..1dbf7211666e 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -218,14 +218,10 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	unsigned long align_goal, addr = -ENOMEM;
-	unsigned long (*get_area)(struct file *, unsigned long,
-				  unsigned long, unsigned long, unsigned long);
-
-	get_area = current->mm->get_unmapped_area;
 
 	if (flags & MAP_FIXED) {
 		/* Ok, don't mess with it. */
-		return get_area(NULL, orig_addr, len, pgoff, flags);
+		return mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags);
 	}
 	flags &= ~MAP_SHARED;
 
@@ -238,7 +234,8 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
 		align_goal = (64UL * 1024);
 
 	do {
-		addr = get_area(NULL, orig_addr, len + (align_goal - PAGE_SIZE), pgoff, flags);
+		addr = mm_get_unmapped_area(current->mm, NULL, orig_addr,
+					    len + (align_goal - PAGE_SIZE), pgoff, flags);
 		if (!(addr & ~PAGE_MASK)) {
 			addr = (addr + (align_goal - 1UL)) & ~(align_goal - 1UL);
 			break;
@@ -256,7 +253,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
 	 * be obtained.
 	 */
 	if (addr & ~PAGE_MASK)
-		addr = get_area(NULL, orig_addr, len, pgoff, flags);
+		addr = mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags);
 
 	return addr;
 }
@@ -292,7 +289,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 	    gap == RLIM_INFINITY ||
 	    sysctl_legacy_va_layout) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
-		mm->get_unmapped_area = arch_get_unmapped_area;
+		clear_bit(MMF_TOPDOWN, &mm->flags);
 	} else {
 		/* We know it's 32-bit */
 		unsigned long task_size = STACK_TOP32;
@@ -303,7 +300,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 			gap = (task_size / 6 * 5);
 
 		mm->mmap_base = PAGE_ALIGN(task_size - gap - random_factor);
-		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		set_bit(MMF_TOPDOWN, &mm->flags);
 	}
 }
 
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index b432500c13a5..38a1bef47efb 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -123,7 +123,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
-	if (mm->get_unmapped_area == arch_get_unmapped_area)
+	if (!test_bit(MMF_TOPDOWN, &mm->flags))
 		return hugetlb_get_unmapped_area_bottomup(file, addr, len,
 				pgoff, flags);
 	else
diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
index 262f5fb18d74..22b65a5f5ec6 100644
--- a/arch/x86/kernel/cpu/sgx/driver.c
+++ b/arch/x86/kernel/cpu/sgx/driver.c
@@ -113,7 +113,7 @@ static unsigned long sgx_get_unmapped_area(struct file *file,
 	if (flags & MAP_FIXED)
 		return addr;
 
-	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
 }
 
 #ifdef CONFIG_COMPAT
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 5804bbae4f01..6d77c0039617 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -141,7 +141,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	}
 
 get_unmapped_area:
-	if (mm->get_unmapped_area == arch_get_unmapped_area)
+	if (!test_bit(MMF_TOPDOWN, &mm->flags))
 		return hugetlb_get_unmapped_area_bottomup(file, addr, len,
 				pgoff, flags);
 	else
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index c90c20904a60..a2cabb1c81e1 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -129,9 +129,9 @@ static void arch_pick_mmap_base(unsigned long *base, unsigned long *legacy_base,
 void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 {
 	if (mmap_is_legacy())
-		mm->get_unmapped_area = arch_get_unmapped_area;
+		clear_bit(MMF_TOPDOWN, &mm->flags);
 	else
-		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		set_bit(MMF_TOPDOWN, &mm->flags);
 
 	arch_pick_mmap_base(&mm->mmap_base, &mm->mmap_legacy_base,
 			arch_rnd(mmap64_rnd_bits), task_size_64bit(0),
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 3c6670cf905f..9b80e622ae80 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -544,7 +544,7 @@ static unsigned long get_unmapped_area_zero(struct file *file,
 	}
 
 	/* Otherwise flags & MAP_PRIVATE: with no shmem object beneath it */
-	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
 #else
 	return -ENOSYS;
 #endif
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 93ebedc5ec8c..47c126d37b59 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -329,14 +329,14 @@ static unsigned long dax_get_unmapped_area(struct file *filp,
 	if ((off + len_align) < off)
 		goto out;
 
-	addr_align = current->mm->get_unmapped_area(filp, addr, len_align,
-			pgoff, flags);
+	addr_align = mm_get_unmapped_area(current->mm, filp, addr, len_align,
+					  pgoff, flags);
 	if (!IS_ERR_VALUE(addr_align)) {
 		addr_align += (off - addr_align) & (align - 1);
 		return addr_align;
 	}
  out:
-	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
 }
 
 static const struct address_space_operations dev_dax_aops = {
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 6502c7e776d1..3dee18bf47ed 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -249,11 +249,11 @@ generic_hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	}
 
 	/*
-	 * Use mm->get_unmapped_area value as a hint to use topdown routine.
+	 * Use MMF_TOPDOWN flag as a hint to use topdown routine.
 	 * If architectures have special needs, they should define their own
 	 * version of hugetlb_get_unmapped_area.
 	 */
-	if (mm->get_unmapped_area == arch_get_unmapped_area_topdown)
+	if (test_bit(MMF_TOPDOWN, &mm->flags))
 		return hugetlb_get_unmapped_area_topdown(file, addr, len,
 				pgoff, flags);
 	return hugetlb_get_unmapped_area_bottomup(file, addr, len,
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 75396a24fd8c..d19434e2a58e 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -455,8 +455,9 @@ pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned lo
 		return pde->proc_ops->proc_get_unmapped_area(file, orig_addr, len, pgoff, flags);
 
 #ifdef CONFIG_MMU
-	return current->mm->get_unmapped_area(file, orig_addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, orig_addr, len, pgoff, flags);
 #endif
+
 	return orig_addr;
 }
 
diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c
index c7a1aa3c882b..b45c7edc3225 100644
--- a/fs/ramfs/file-mmu.c
+++ b/fs/ramfs/file-mmu.c
@@ -35,7 +35,7 @@ static unsigned long ramfs_mmu_get_unmapped_area(struct file *file,
 		unsigned long addr, unsigned long len, unsigned long pgoff,
 		unsigned long flags)
 {
-	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
 }
 
 const struct file_operations ramfs_file_operations = {
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5240bd7bca33..9313e43123d4 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -777,11 +777,7 @@ struct mm_struct {
 		} ____cacheline_aligned_in_smp;
 
 		struct maple_tree mm_mt;
-#ifdef CONFIG_MMU
-		unsigned long (*get_unmapped_area) (struct file *filp,
-				unsigned long addr, unsigned long len,
-				unsigned long pgoff, unsigned long flags);
-#endif
+
 		unsigned long mmap_base;	/* base of mmap area */
 		unsigned long mmap_legacy_base;	/* base of mmap area in bottom-up allocations */
 #ifdef CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES
diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index 02f5090ffea2..e62ff805cfc9 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -92,9 +92,12 @@ static inline int get_dumpable(struct mm_struct *mm)
 #define MMF_VM_MERGE_ANY	30
 #define MMF_VM_MERGE_ANY_MASK	(1 << MMF_VM_MERGE_ANY)
 
+#define MMF_TOPDOWN		31	/* mm searches top down by default */
+#define MMF_TOPDOWN_MASK	(1 << MMF_TOPDOWN)
+
 #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
 				 MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\
-				 MMF_VM_MERGE_ANY_MASK)
+				 MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK)
 
 static inline unsigned long mmf_init_flags(unsigned long flags)
 {
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index b6543f9d78d6..ed1caa26c8be 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -8,6 +8,7 @@
 #include <linux/mm_types.h>
 #include <linux/gfp.h>
 #include <linux/sync_core.h>
+#include <linux/sched/coredump.h>
 
 /*
  * Routines for handling mm_structs
@@ -186,6 +187,10 @@ arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
 			  unsigned long len, unsigned long pgoff,
 			  unsigned long flags);
 
+unsigned long mm_get_unmapped_area(struct mm_struct *mm, struct file *filp,
+				   unsigned long addr, unsigned long len,
+				   unsigned long pgoff, unsigned long flags);
+
 unsigned long
 generic_get_unmapped_area(struct file *filp, unsigned long addr,
 			  unsigned long len, unsigned long pgoff,
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 5d4b448fdc50..405bab0a560c 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3520,7 +3520,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
 #else
 	addr = 0UL;
 #endif
-	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
 }
 
 #else /* !CONFIG_MMU */
diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
index 86571e760dd6..74d566dcd2cb 100644
--- a/kernel/bpf/arena.c
+++ b/kernel/bpf/arena.c
@@ -314,7 +314,7 @@ static unsigned long arena_get_unmapped_area(struct file *filp, unsigned long ad
 			return -EINVAL;
 	}
 
-	ret = current->mm->get_unmapped_area(filp, addr, len * 2, 0, flags);
+	ret = mm_get_unmapped_area(current->mm, filp, addr, len * 2, 0, flags);
 	if (IS_ERR_VALUE(ret))
 		return ret;
 	if ((ret >> 32) == ((ret + len - 1) >> 32))
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ae2ff73bde7e..dead5e1977d8 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -980,7 +980,7 @@ static unsigned long bpf_get_unmapped_area(struct file *filp, unsigned long addr
 	if (map->ops->map_get_unmapped_area)
 		return map->ops->map_get_unmapped_area(filp, addr, len, pgoff, flags);
 #ifdef CONFIG_MMU
-	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
 #else
 	return addr;
 #endif
diff --git a/mm/debug.c b/mm/debug.c
index c1c1a6a484e4..37a17f77df9f 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -180,9 +180,6 @@ EXPORT_SYMBOL(dump_vma);
 void dump_mm(const struct mm_struct *mm)
 {
 	pr_emerg("mm %px task_size %lu\n"
-#ifdef CONFIG_MMU
-		"get_unmapped_area %px\n"
-#endif
 		"mmap_base %lu mmap_legacy_base %lu\n"
 		"pgd %px mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n"
 		"hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n"
@@ -208,9 +205,6 @@ void dump_mm(const struct mm_struct *mm)
 		"def_flags: %#lx(%pGv)\n",
 
 		mm, mm->task_size,
-#ifdef CONFIG_MMU
-		mm->get_unmapped_area,
-#endif
 		mm->mmap_base, mm->mmap_legacy_base,
 		mm->pgd, atomic_read(&mm->mm_users),
 		atomic_read(&mm->mm_count),
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9859aa4f7553..cede9ccb84dc 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -824,8 +824,8 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
 	if (len_pad < len || (off + len_pad) < off)
 		return 0;
 
-	ret = current->mm->get_unmapped_area(filp, addr, len_pad,
-					      off >> PAGE_SHIFT, flags);
+	ret = mm_get_unmapped_area(current->mm, filp, addr, len_pad,
+				   off >> PAGE_SHIFT, flags);
 
 	/*
 	 * The failure might be due to length padding. The caller will retry
@@ -843,8 +843,7 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
 
 	off_sub = (off - ret) & (size - 1);
 
-	if (current->mm->get_unmapped_area == arch_get_unmapped_area_topdown &&
-	    !off_sub)
+	if (test_bit(MMF_TOPDOWN, &current->mm->flags) && !off_sub)
 		return ret + size;
 
 	ret += off_sub;
@@ -861,7 +860,7 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 	if (ret)
 		return ret;
 
-	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
 }
 EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 6dbda99a47da..224e9ce1e2fd 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1813,7 +1813,8 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags)
 {
 	unsigned long (*get_area)(struct file *, unsigned long,
-				  unsigned long, unsigned long, unsigned long);
+				  unsigned long, unsigned long, unsigned long)
+				  = NULL;
 
 	unsigned long error = arch_mmap_check(addr, len, flags);
 	if (error)
@@ -1823,7 +1824,6 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
-	get_area = current->mm->get_unmapped_area;
 	if (file) {
 		if (file->f_op->get_unmapped_area)
 			get_area = file->f_op->get_unmapped_area;
@@ -1842,7 +1842,11 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	if (!file)
 		pgoff = 0;
 
-	addr = get_area(file, addr, len, pgoff, flags);
+	if (get_area)
+		addr = get_area(file, addr, len, pgoff, flags);
+	else
+		addr = mm_get_unmapped_area(current->mm, file, addr, len,
+					    pgoff, flags);
 	if (IS_ERR_VALUE(addr))
 		return addr;
 
@@ -1857,6 +1861,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 
 EXPORT_SYMBOL(get_unmapped_area);
 
+unsigned long
+mm_get_unmapped_area(struct mm_struct *mm, struct file *file,
+		     unsigned long addr, unsigned long len,
+		     unsigned long pgoff, unsigned long flags)
+{
+	if (test_bit(MMF_TOPDOWN, &mm->flags))
+		return arch_get_unmapped_area_topdown(file, addr, len, pgoff, flags);
+	return arch_get_unmapped_area(file, addr, len, pgoff, flags);
+}
+EXPORT_SYMBOL(mm_get_unmapped_area);
+
 /**
  * find_vma_intersection() - Look up the first VMA which intersects the interval
  * @mm: The process address space.
diff --git a/mm/shmem.c b/mm/shmem.c
index 0aad0d9a621b..4078c3a1b2d0 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2273,8 +2273,6 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 				      unsigned long uaddr, unsigned long len,
 				      unsigned long pgoff, unsigned long flags)
 {
-	unsigned long (*get_area)(struct file *,
-		unsigned long, unsigned long, unsigned long, unsigned long);
 	unsigned long addr;
 	unsigned long offset;
 	unsigned long inflated_len;
@@ -2284,8 +2282,8 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
-	get_area = current->mm->get_unmapped_area;
-	addr = get_area(file, uaddr, len, pgoff, flags);
+	addr = mm_get_unmapped_area(current->mm, file, uaddr, len, pgoff,
+				    flags);
 
 	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
 		return addr;
@@ -2342,7 +2340,8 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	if (inflated_len < len)
 		return addr;
 
-	inflated_addr = get_area(NULL, uaddr, inflated_len, 0, flags);
+	inflated_addr = mm_get_unmapped_area(current->mm, NULL, uaddr,
+					     inflated_len, 0, flags);
 	if (IS_ERR_VALUE(inflated_addr))
 		return addr;
 	if (inflated_addr & ~PAGE_MASK)
@@ -4807,7 +4806,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 				      unsigned long addr, unsigned long len,
 				      unsigned long pgoff, unsigned long flags)
 {
-	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
 }
 #endif
 
diff --git a/mm/util.c b/mm/util.c
index 669397235787..8619d353a1aa 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -469,17 +469,17 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 
 	if (mmap_is_legacy(rlim_stack)) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
-		mm->get_unmapped_area = arch_get_unmapped_area;
+		clear_bit(MMF_TOPDOWN, &mm->flags);
 	} else {
 		mm->mmap_base = mmap_base(random_factor, rlim_stack);
-		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		set_bit(MMF_TOPDOWN, &mm->flags);
 	}
 }
 #elif defined(CONFIG_MMU) && !defined(HAVE_ARCH_PICK_MMAP_LAYOUT)
 void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 {
 	mm->mmap_base = TASK_UNMAPPED_BASE;
-	mm->get_unmapped_area = arch_get_unmapped_area;
+	clear_bit(MMF_TOPDOWN, &mm->flags);
 }
 #endif
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 10/14] treewide: Use initializer for struct vm_unmapped_area_info
       [not found] <20240326021656.202649-1-rick.p.edgecombe@intel.com>
  2024-03-26  2:16 ` [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag Rick Edgecombe
@ 2024-03-26  2:16 ` Rick Edgecombe
  1 sibling, 0 replies; 9+ messages in thread
From: Rick Edgecombe @ 2024-03-26  2:16 UTC (permalink / raw)
  To: Liam.Howlett, akpm, bp, broonie, christophe.leroy, dave.hansen,
	debug, hpa, keescook, kirill.shutemov, luto, mingo, peterz, tglx,
	x86
  Cc: rick.p.edgecombe, linux-kernel, linux-mm, linux-alpha,
	linux-snps-arc, linux-arm-kernel, linux-csky, loongarch,
	linux-mips, linux-s390, linux-sh, sparclinux

Future changes will need to add a new member to struct
vm_unmapped_area_info. This would cause trouble for any call site that
doesn't initialize the struct. Currently every caller sets each member
manually, so if new ones are added they will be uninitialized and the
core code parsing the struct will see garbage in the new member.

It could be possible to initialize the new member manually to 0 at each
call site. This and a couple other options were discussed. Having some
struct vm_unmapped_area_info instances not zero initialized will put
those sites at risk of feeding garbage into vm_unmapped_area(), if the
convention is to zero initialize the struct and any new field addition
missed a call site that initializes each field manually. So it is
useful to do things similar across the kernel.

The consensus (see links) was that in general the best way to accomplish
taking into account both code cleanliness and minimizing the chance of
introducing bugs, was to do C99 static initialization. As in:
struct vm_unmapped_area_info info = {};

With this method of initialization, the whole struct will be zero
initialized, and any statements setting fields to zero will be unneeded.
The change should not leave cleanup at the call sides.

While iterating though the possible solutions a few archs kindly acked
other variations that still zero initialized the struct. These sites have
been modified in previous changes using the pattern acked by the respective
arch.

So to be reduce the chance of bugs via uninitialized fields, perform a
tree wide change using the consensus for the best general way to do this
change. Use C99 static initializing to zero the struct and remove and
statements that simply set members to zero.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: linux-mm@kvack.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: loongarch@lists.linux.dev
Cc: linux-mips@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Link: https://lore.kernel.org/lkml/202402280912.33AEE7A9CF@keescook/#t
Link: https://lore.kernel.org/lkml/j7bfvig3gew3qruouxrh7z7ehjjafrgkbcmg6tcghhfh3rhmzi@wzlcoecgy5rs/
Link: https://lore.kernel.org/lkml/ec3e377a-c0a0-4dd3-9cb9-96517e54d17e@csgroup.eu/
---
v4:
 - Trivial rebase conflict in s390

Hi archs,

For some context, this is part of a larger series to improve shadow stack
guard gaps. It involves plumbing a new field via
struct vm_unmapped_area_info. The first user is x86, but arm and riscv may
likely use it as well. The change is compile tested only for non-x86.

Thanks,

Rick
---
 arch/alpha/kernel/osf_sys.c      | 5 +----
 arch/arc/mm/mmap.c               | 4 +---
 arch/arm/mm/mmap.c               | 5 ++---
 arch/loongarch/mm/mmap.c         | 3 +--
 arch/mips/mm/mmap.c              | 3 +--
 arch/s390/mm/hugetlbpage.c       | 7 ++-----
 arch/s390/mm/mmap.c              | 5 ++---
 arch/sh/mm/mmap.c                | 5 ++---
 arch/sparc/kernel/sys_sparc_32.c | 3 +--
 arch/sparc/kernel/sys_sparc_64.c | 5 ++---
 arch/sparc/mm/hugetlbpage.c      | 7 ++-----
 arch/x86/kernel/sys_x86_64.c     | 7 ++-----
 arch/x86/mm/hugetlbpage.c        | 7 ++-----
 fs/hugetlbfs/inode.c             | 7 ++-----
 mm/mmap.c                        | 9 ++-------
 15 files changed, 25 insertions(+), 57 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 5db88b627439..e5f881bc8288 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1218,14 +1218,11 @@ static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
 		         unsigned long limit)
 {
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = addr;
 	info.high_limit = limit;
-	info.align_mask = 0;
-	info.align_offset = 0;
 	return vm_unmapped_area(&info);
 }
 
diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c
index 3c1c7ae73292..69a915297155 100644
--- a/arch/arc/mm/mmap.c
+++ b/arch/arc/mm/mmap.c
@@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	/*
 	 * We enforce the MAP_FIXED case.
@@ -51,11 +51,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 			return addr;
 	}
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = mm->mmap_base;
 	info.high_limit = TASK_SIZE;
-	info.align_mask = 0;
 	info.align_offset = pgoff << PAGE_SHIFT;
 	return vm_unmapped_area(&info);
 }
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index a0f8a0ca0788..d65d0e6ed10a 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -34,7 +34,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	struct vm_area_struct *vma;
 	int do_align = 0;
 	int aliasing = cache_is_vipt_aliasing();
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	/*
 	 * We only need to do colour alignment if either the I or D
@@ -68,7 +68,6 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 			return addr;
 	}
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = mm->mmap_base;
 	info.high_limit = TASK_SIZE;
@@ -87,7 +86,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	unsigned long addr = addr0;
 	int do_align = 0;
 	int aliasing = cache_is_vipt_aliasing();
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	/*
 	 * We only need to do colour alignment if either the I or D
diff --git a/arch/loongarch/mm/mmap.c b/arch/loongarch/mm/mmap.c
index a9630a81b38a..4bbd449b4a47 100644
--- a/arch/loongarch/mm/mmap.c
+++ b/arch/loongarch/mm/mmap.c
@@ -24,7 +24,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 	struct vm_area_struct *vma;
 	unsigned long addr = addr0;
 	int do_color_align;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	if (unlikely(len > TASK_SIZE))
 		return -ENOMEM;
@@ -82,7 +82,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 		 */
 	}
 
-	info.flags = 0;
 	info.low_limit = mm->mmap_base;
 	info.high_limit = TASK_SIZE;
 	return vm_unmapped_area(&info);
diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index 00fe90c6db3e..7e11d7b58761 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -34,7 +34,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 	struct vm_area_struct *vma;
 	unsigned long addr = addr0;
 	int do_color_align;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	if (unlikely(len > TASK_SIZE))
 		return -ENOMEM;
@@ -92,7 +92,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 		 */
 	}
 
-	info.flags = 0;
 	info.low_limit = mm->mmap_base;
 	info.high_limit = TASK_SIZE;
 	return vm_unmapped_area(&info);
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index 219d906fe830..46de7a4c0309 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -258,14 +258,12 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
 		unsigned long pgoff, unsigned long flags)
 {
 	struct hstate *h = hstate_file(file);
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = current->mm->mmap_base;
 	info.high_limit = TASK_SIZE;
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	return vm_unmapped_area(&info);
 }
 
@@ -274,7 +272,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
 		unsigned long pgoff, unsigned long flags)
 {
 	struct hstate *h = hstate_file(file);
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 	unsigned long addr;
 
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
@@ -282,7 +280,6 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = current->mm->mmap_base;
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	/*
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index 6b2e4436ad4a..206756946589 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -86,7 +86,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	if (len > TASK_SIZE - mmap_min_addr)
 		return -ENOMEM;
@@ -102,7 +102,6 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
 			goto check_asce_limit;
 	}
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = mm->mmap_base;
 	info.high_limit = TASK_SIZE;
@@ -122,7 +121,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp, unsigned long ad
 {
 	struct vm_area_struct *vma;
 	struct mm_struct *mm = current->mm;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	/* requested length too big for entire address space */
 	if (len > TASK_SIZE - mmap_min_addr)
diff --git a/arch/sh/mm/mmap.c b/arch/sh/mm/mmap.c
index b82199878b45..bee329d4149a 100644
--- a/arch/sh/mm/mmap.c
+++ b/arch/sh/mm/mmap.c
@@ -57,7 +57,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
 	int do_colour_align;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	if (flags & MAP_FIXED) {
 		/* We do not accept a shared mapping if it would violate
@@ -88,7 +88,6 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
 			return addr;
 	}
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = TASK_UNMAPPED_BASE;
 	info.high_limit = TASK_SIZE;
@@ -106,7 +105,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	struct mm_struct *mm = current->mm;
 	unsigned long addr = addr0;
 	int do_colour_align;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	if (flags & MAP_FIXED) {
 		/* We do not accept a shared mapping if it would violate
diff --git a/arch/sparc/kernel/sys_sparc_32.c b/arch/sparc/kernel/sys_sparc_32.c
index 082a551897ed..08a19727795c 100644
--- a/arch/sparc/kernel/sys_sparc_32.c
+++ b/arch/sparc/kernel/sys_sparc_32.c
@@ -41,7 +41,7 @@ SYSCALL_DEFINE0(getpagesize)
 
 unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags)
 {
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	if (flags & MAP_FIXED) {
 		/* We do not accept a shared mapping if it would violate
@@ -59,7 +59,6 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 	if (!addr)
 		addr = TASK_UNMAPPED_BASE;
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = addr;
 	info.high_limit = TASK_SIZE;
diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 1dbf7211666e..d9c3b34ca744 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -93,7 +93,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 	struct vm_area_struct * vma;
 	unsigned long task_size = TASK_SIZE;
 	int do_color_align;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	if (flags & MAP_FIXED) {
 		/* We do not accept a shared mapping if it would violate
@@ -126,7 +126,6 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 			return addr;
 	}
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = TASK_UNMAPPED_BASE;
 	info.high_limit = min(task_size, VA_EXCLUDE_START);
@@ -154,7 +153,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	unsigned long task_size = STACK_TOP32;
 	unsigned long addr = addr0;
 	int do_color_align;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	/* This should only ever run for 32-bit processes.  */
 	BUG_ON(!test_thread_flag(TIF_32BIT));
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index 38a1bef47efb..4caf56b32e26 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -31,17 +31,15 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *filp,
 {
 	struct hstate *h = hstate_file(filp);
 	unsigned long task_size = TASK_SIZE;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	if (test_thread_flag(TIF_32BIT))
 		task_size = STACK_TOP32;
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = TASK_UNMAPPED_BASE;
 	info.high_limit = min(task_size, VA_EXCLUDE_START);
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	if ((addr & ~PAGE_MASK) && task_size > VA_EXCLUDE_END) {
@@ -63,7 +61,7 @@ hugetlb_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	struct hstate *h = hstate_file(filp);
 	struct mm_struct *mm = current->mm;
 	unsigned long addr = addr0;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	/* This should only ever run for 32-bit processes.  */
 	BUG_ON(!test_thread_flag(TIF_32BIT));
@@ -73,7 +71,6 @@ hugetlb_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = mm->mmap_base;
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	/*
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index cb9fa1d5c66f..96b9d29aead0 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -118,7 +118,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 	unsigned long begin, end;
 
 	if (flags & MAP_FIXED)
@@ -137,11 +137,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 			return addr;
 	}
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = begin;
 	info.high_limit = end;
-	info.align_mask = 0;
 	info.align_offset = pgoff << PAGE_SHIFT;
 	if (filp) {
 		info.align_mask = get_align_mask();
@@ -158,7 +156,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	struct vm_area_struct *vma;
 	struct mm_struct *mm = current->mm;
 	unsigned long addr = addr0;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	/* requested length too big for entire address space */
 	if (len > TASK_SIZE)
@@ -203,7 +201,6 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	if (addr > DEFAULT_MAP_WINDOW && !in_32bit_syscall())
 		info.high_limit += TASK_SIZE_MAX - DEFAULT_MAP_WINDOW;
 
-	info.align_mask = 0;
 	info.align_offset = pgoff << PAGE_SHIFT;
 	if (filp) {
 		info.align_mask = get_align_mask();
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 6d77c0039617..fb600949a355 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -51,9 +51,8 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
 		unsigned long pgoff, unsigned long flags)
 {
 	struct hstate *h = hstate_file(file);
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = get_mmap_base(1);
 
@@ -65,7 +64,6 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
 		task_size_32bit() : task_size_64bit(addr > DEFAULT_MAP_WINDOW);
 
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	return vm_unmapped_area(&info);
 }
 
@@ -74,7 +72,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
 		unsigned long pgoff, unsigned long flags)
 {
 	struct hstate *h = hstate_file(file);
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
 	info.length = len;
@@ -89,7 +87,6 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
 		info.high_limit += TASK_SIZE_MAX - DEFAULT_MAP_WINDOW;
 
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	/*
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 3dee18bf47ed..2f4e88552d3f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -176,14 +176,12 @@ hugetlb_get_unmapped_area_bottomup(struct file *file, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	struct hstate *h = hstate_file(file);
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = current->mm->mmap_base;
 	info.high_limit = arch_get_mmap_end(addr, len, flags);
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	return vm_unmapped_area(&info);
 }
 
@@ -192,14 +190,13 @@ hugetlb_get_unmapped_area_topdown(struct file *file, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	struct hstate *h = hstate_file(file);
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
 	info.length = len;
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	/*
diff --git a/mm/mmap.c b/mm/mmap.c
index f734e4fa6d94..609c087bba8e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1705,7 +1705,7 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
 
 	if (len > mmap_end - mmap_min_addr)
@@ -1723,12 +1723,9 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr,
 			return addr;
 	}
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = mm->mmap_base;
 	info.high_limit = mmap_end;
-	info.align_mask = 0;
-	info.align_offset = 0;
 	return vm_unmapped_area(&info);
 }
 
@@ -1753,7 +1750,7 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
 {
 	struct vm_area_struct *vma, *prev;
 	struct mm_struct *mm = current->mm;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
 
 	/* requested length too big for entire address space */
@@ -1777,8 +1774,6 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
 	info.length = len;
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = arch_get_mmap_base(addr, mm->mmap_base);
-	info.align_mask = 0;
-	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	/*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag
  2024-03-26  2:16 ` [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag Rick Edgecombe
@ 2024-03-26  3:32   ` Alexei Starovoitov
  2024-03-26 11:57   ` Jarkko Sakkinen
  2024-03-27  6:38   ` Dan Williams
  2 siblings, 0 replies; 9+ messages in thread
From: Alexei Starovoitov @ 2024-03-26  3:32 UTC (permalink / raw)
  To: Rick Edgecombe
  Cc: Liam.Howlett, Andrew Morton, Borislav Petkov, Mark Brown,
	Christophe Leroy, Dave Hansen, debug, H. Peter Anvin, Kees Cook,
	Kirill A. Shutemov, Andy Lutomirski, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, X86 ML, LKML, linux-mm, linux-s390, sparclinux,
	linux-sgx, nvdimm, linux-cxl, Linux-Fsdevel, io-uring, bpf

On Mon, Mar 25, 2024 at 7:17 PM Rick Edgecombe
<rick.p.edgecombe@intel.com> wrote:
>
>
> diff --git a/mm/util.c b/mm/util.c
> index 669397235787..8619d353a1aa 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -469,17 +469,17 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
>
>         if (mmap_is_legacy(rlim_stack)) {
>                 mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
> -               mm->get_unmapped_area = arch_get_unmapped_area;
> +               clear_bit(MMF_TOPDOWN, &mm->flags);
>         } else {
>                 mm->mmap_base = mmap_base(random_factor, rlim_stack);
> -               mm->get_unmapped_area = arch_get_unmapped_area_topdown;
> +               set_bit(MMF_TOPDOWN, &mm->flags);
>         }
>  }
>  #elif defined(CONFIG_MMU) && !defined(HAVE_ARCH_PICK_MMAP_LAYOUT)
>  void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
>  {
>         mm->mmap_base = TASK_UNMAPPED_BASE;
> -       mm->get_unmapped_area = arch_get_unmapped_area;
> +       clear_bit(MMF_TOPDOWN, &mm->flags);
>  }
>  #endif

Makes sense to me.
Acked-by: Alexei Starovoitov <ast@kernel.org>
for the idea and for bpf bits.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag
  2024-03-26  2:16 ` [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag Rick Edgecombe
  2024-03-26  3:32   ` Alexei Starovoitov
@ 2024-03-26 11:57   ` Jarkko Sakkinen
  2024-03-27  2:42     ` Edgecombe, Rick P
  2024-03-27  6:38   ` Dan Williams
  2 siblings, 1 reply; 9+ messages in thread
From: Jarkko Sakkinen @ 2024-03-26 11:57 UTC (permalink / raw)
  To: Rick Edgecombe, Liam.Howlett, akpm, bp, broonie, christophe.leroy,
	dave.hansen, debug, hpa, keescook, kirill.shutemov, luto, mingo,
	peterz, tglx, x86
  Cc: linux-kernel, linux-mm, linux-s390, sparclinux, linux-sgx, nvdimm,
	linux-cxl, linux-fsdevel, io-uring, bpf

On Tue Mar 26, 2024 at 4:16 AM EET, Rick Edgecombe wrote:
> The mm_struct contains a function pointer *get_unmapped_area(), which
> is set to either arch_get_unmapped_area() or
> arch_get_unmapped_area_topdown() during the initialization of the mm.

In which conditions which path is used during the initialization of mm
and why is this the case? It is an open claim in the current form.

That would be nice to have documented for the sake of being complete
description. I have zero doubts of the claim being untrue.

BR, Jarkko

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag
  2024-03-26 11:57   ` Jarkko Sakkinen
@ 2024-03-27  2:42     ` Edgecombe, Rick P
  2024-03-27 13:15       ` Jarkko Sakkinen
  0 siblings, 1 reply; 9+ messages in thread
From: Edgecombe, Rick P @ 2024-03-27  2:42 UTC (permalink / raw)
  To: keescook@chromium.org, luto@kernel.org,
	dave.hansen@linux.intel.com, debug@rivosinc.com,
	akpm@linux-foundation.org, Liam.Howlett@oracle.com,
	kirill.shutemov@linux.intel.com, mingo@redhat.com,
	christophe.leroy@csgroup.eu, tglx@linutronix.de,
	jarkko@kernel.org, hpa@zytor.com, peterz@infradead.org,
	bp@alien8.de, x86@kernel.org, broonie@kernel.org
  Cc: linux-sgx@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-mm@kvack.org, linux-cxl@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-kernel@vger.kernel.org,
	io-uring@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	nvdimm@lists.linux.dev, bpf@vger.kernel.org

On Tue, 2024-03-26 at 13:57 +0200, Jarkko Sakkinen wrote:
> In which conditions which path is used during the initialization of mm
> and why is this the case? It is an open claim in the current form.

There is an arch_pick_mmap_layout() that arch's can have their own rules for. There is also a
generic one. It gets called during exec.

> 
> That would be nice to have documented for the sake of being complete
> description. I have zero doubts of the claim being untrue.

...being untrue?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag
  2024-03-26  2:16 ` [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag Rick Edgecombe
  2024-03-26  3:32   ` Alexei Starovoitov
  2024-03-26 11:57   ` Jarkko Sakkinen
@ 2024-03-27  6:38   ` Dan Williams
  2024-03-28  3:31     ` Edgecombe, Rick P
  2 siblings, 1 reply; 9+ messages in thread
From: Dan Williams @ 2024-03-27  6:38 UTC (permalink / raw)
  To: Rick Edgecombe, Liam.Howlett, akpm, bp, broonie, christophe.leroy,
	dave.hansen, debug, hpa, keescook, kirill.shutemov, luto, mingo,
	peterz, tglx, x86
  Cc: rick.p.edgecombe, linux-kernel, linux-mm, linux-s390, sparclinux,
	linux-sgx, nvdimm, linux-cxl, linux-fsdevel, io-uring, bpf

Rick Edgecombe wrote:
> The mm_struct contains a function pointer *get_unmapped_area(), which
> is set to either arch_get_unmapped_area() or
> arch_get_unmapped_area_topdown() during the initialization of the mm.
> 
> Since the function pointer only ever points to two functions that are named
> the same across all arch's, a function pointer is not really required. In
> addition future changes will want to add versions of the functions that
> take additional arguments. So to save a pointers worth of bytes in
> mm_struct, and prevent adding additional function pointers to mm_struct in
> future changes, remove it and keep the information about which
> get_unmapped_area() to use in a flag.
> 
> Add the new flag to MMF_INIT_MASK so it doesn't get clobbered on fork by
> mmf_init_flags(). Most MM flags get clobbered on fork. In the pre-existing
> behavior mm->get_unmapped_area() would get copied to the new mm in
> dup_mm(), so not clobbering the flag preserves the existing behavior
> around inheriting the topdown-ness.
> 
> Introduce a helper, mm_get_unmapped_area(), to easily convert code that
> refers to the old function pointer to instead select and call either
> arch_get_unmapped_area() or arch_get_unmapped_area_topdown() based on the
> flag. Then drop the mm->get_unmapped_area() function pointer. Leave the
> get_unmapped_area() pointer in struct file_operations alone. The main
> purpose of this change is to reorganize in preparation for future changes,
> but it also converts the calls of mm->get_unmapped_area() from indirect
> branches into a direct ones.
> 
> The stress-ng bigheap benchmark calls realloc a lot, which calls through
> get_unmapped_area() in the kernel. On x86, the change yielded a ~1%
> improvement there on a retpoline config.
> 
> In testing a few x86 configs, removing the pointer unfortunately didn't
> result in any actual size reductions in the compiled layout of mm_struct.
> But depending on compiler or arch alignment requirements, the change could
> shrink the size of mm_struct.
> 
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: linux-s390@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: sparclinux@vger.kernel.org
> Cc: linux-sgx@vger.kernel.org
> Cc: nvdimm@lists.linux.dev
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: linux-fsdevel@vger.kernel.org
> Cc: io-uring@vger.kernel.org
> Cc: bpf@vger.kernel.org
> ---
> v4:
>  - Split out pde_get_unmapped_area() refactor into separate patch
>    (Christophe Leroy)
> 
> v3:
>  - Fix comment that still referred to mm->get_unmapped_area()
>  - Resolve trivial rebase conflicts with "mm: thp_get_unmapped_area must
>    honour topdown preference"
>  - Spelling fix in log
> 
> v2:
>  - Fix comment on MMF_TOPDOWN (Kirill, rppt)
>  - Move MMF_TOPDOWN to actually unused bit
>  - Add MMF_TOPDOWN to MMF_INIT_MASK so it doesn't get clobbered on fork,
>    and result in the children using the search up path.
>  - New lower performance results after above bug fix
>  - Add Reviews and Acks
> ---
>  arch/s390/mm/hugetlbpage.c       |  2 +-
>  arch/s390/mm/mmap.c              |  4 ++--
>  arch/sparc/kernel/sys_sparc_64.c | 15 ++++++---------
>  arch/sparc/mm/hugetlbpage.c      |  2 +-
>  arch/x86/kernel/cpu/sgx/driver.c |  2 +-
>  arch/x86/mm/hugetlbpage.c        |  2 +-
>  arch/x86/mm/mmap.c               |  4 ++--
>  drivers/char/mem.c               |  2 +-
>  drivers/dax/device.c             |  6 +++---
>  fs/hugetlbfs/inode.c             |  4 ++--
>  fs/proc/inode.c                  |  3 ++-
>  fs/ramfs/file-mmu.c              |  2 +-
>  include/linux/mm_types.h         |  6 +-----
>  include/linux/sched/coredump.h   |  5 ++++-
>  include/linux/sched/mm.h         |  5 +++++
>  io_uring/io_uring.c              |  2 +-
>  kernel/bpf/arena.c               |  2 +-
>  kernel/bpf/syscall.c             |  2 +-
>  mm/debug.c                       |  6 ------
>  mm/huge_memory.c                 |  9 ++++-----
>  mm/mmap.c                        | 21 ++++++++++++++++++---
>  mm/shmem.c                       | 11 +++++------
>  mm/util.c                        |  6 +++---
>  23 files changed, 66 insertions(+), 57 deletions(-)
> 
> diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
> index c2e8242bd15d..219d906fe830 100644
> --- a/arch/s390/mm/hugetlbpage.c
> +++ b/arch/s390/mm/hugetlbpage.c
> @@ -328,7 +328,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>  			goto check_asce_limit;
>  	}
>  
> -	if (mm->get_unmapped_area == arch_get_unmapped_area)
> +	if (!test_bit(MMF_TOPDOWN, &mm->flags))
>  		addr = hugetlb_get_unmapped_area_bottomup(file, addr, len,
>  				pgoff, flags);
>  	else
> diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
> index b14fc0887654..6b2e4436ad4a 100644
> --- a/arch/s390/mm/mmap.c
> +++ b/arch/s390/mm/mmap.c
> @@ -185,10 +185,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
>  	 */
>  	if (mmap_is_legacy(rlim_stack)) {
>  		mm->mmap_base = mmap_base_legacy(random_factor);
> -		mm->get_unmapped_area = arch_get_unmapped_area;
> +		clear_bit(MMF_TOPDOWN, &mm->flags);
>  	} else {
>  		mm->mmap_base = mmap_base(random_factor, rlim_stack);
> -		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
> +		set_bit(MMF_TOPDOWN, &mm->flags);
>  	}
>  }
>  
> diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
> index 1e9a9e016237..1dbf7211666e 100644
> --- a/arch/sparc/kernel/sys_sparc_64.c
> +++ b/arch/sparc/kernel/sys_sparc_64.c
> @@ -218,14 +218,10 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
>  unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, unsigned long len, unsigned long pgoff, unsigned long flags)
>  {
>  	unsigned long align_goal, addr = -ENOMEM;
> -	unsigned long (*get_area)(struct file *, unsigned long,
> -				  unsigned long, unsigned long, unsigned long);
> -
> -	get_area = current->mm->get_unmapped_area;
>  
>  	if (flags & MAP_FIXED) {
>  		/* Ok, don't mess with it. */
> -		return get_area(NULL, orig_addr, len, pgoff, flags);
> +		return mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags);
>  	}
>  	flags &= ~MAP_SHARED;
>  
> @@ -238,7 +234,8 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
>  		align_goal = (64UL * 1024);
>  
>  	do {
> -		addr = get_area(NULL, orig_addr, len + (align_goal - PAGE_SIZE), pgoff, flags);
> +		addr = mm_get_unmapped_area(current->mm, NULL, orig_addr,
> +					    len + (align_goal - PAGE_SIZE), pgoff, flags);
>  		if (!(addr & ~PAGE_MASK)) {
>  			addr = (addr + (align_goal - 1UL)) & ~(align_goal - 1UL);
>  			break;
> @@ -256,7 +253,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
>  	 * be obtained.
>  	 */
>  	if (addr & ~PAGE_MASK)
> -		addr = get_area(NULL, orig_addr, len, pgoff, flags);
> +		addr = mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags);
>  
>  	return addr;
>  }
> @@ -292,7 +289,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
>  	    gap == RLIM_INFINITY ||
>  	    sysctl_legacy_va_layout) {
>  		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
> -		mm->get_unmapped_area = arch_get_unmapped_area;
> +		clear_bit(MMF_TOPDOWN, &mm->flags);
>  	} else {
>  		/* We know it's 32-bit */
>  		unsigned long task_size = STACK_TOP32;
> @@ -303,7 +300,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
>  			gap = (task_size / 6 * 5);
>  
>  		mm->mmap_base = PAGE_ALIGN(task_size - gap - random_factor);
> -		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
> +		set_bit(MMF_TOPDOWN, &mm->flags);
>  	}
>  }
>  
> diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
> index b432500c13a5..38a1bef47efb 100644
> --- a/arch/sparc/mm/hugetlbpage.c
> +++ b/arch/sparc/mm/hugetlbpage.c
> @@ -123,7 +123,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>  		    (!vma || addr + len <= vm_start_gap(vma)))
>  			return addr;
>  	}
> -	if (mm->get_unmapped_area == arch_get_unmapped_area)
> +	if (!test_bit(MMF_TOPDOWN, &mm->flags))
>  		return hugetlb_get_unmapped_area_bottomup(file, addr, len,
>  				pgoff, flags);
>  	else
> diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
> index 262f5fb18d74..22b65a5f5ec6 100644
> --- a/arch/x86/kernel/cpu/sgx/driver.c
> +++ b/arch/x86/kernel/cpu/sgx/driver.c
> @@ -113,7 +113,7 @@ static unsigned long sgx_get_unmapped_area(struct file *file,
>  	if (flags & MAP_FIXED)
>  		return addr;
>  
> -	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
> +	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
>  }
>  
>  #ifdef CONFIG_COMPAT
> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
> index 5804bbae4f01..6d77c0039617 100644
> --- a/arch/x86/mm/hugetlbpage.c
> +++ b/arch/x86/mm/hugetlbpage.c
> @@ -141,7 +141,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>  	}
>  
>  get_unmapped_area:
> -	if (mm->get_unmapped_area == arch_get_unmapped_area)
> +	if (!test_bit(MMF_TOPDOWN, &mm->flags))
>  		return hugetlb_get_unmapped_area_bottomup(file, addr, len,
>  				pgoff, flags);
>  	else
> diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
> index c90c20904a60..a2cabb1c81e1 100644
> --- a/arch/x86/mm/mmap.c
> +++ b/arch/x86/mm/mmap.c
> @@ -129,9 +129,9 @@ static void arch_pick_mmap_base(unsigned long *base, unsigned long *legacy_base,
>  void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
>  {
>  	if (mmap_is_legacy())
> -		mm->get_unmapped_area = arch_get_unmapped_area;
> +		clear_bit(MMF_TOPDOWN, &mm->flags);
>  	else
> -		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
> +		set_bit(MMF_TOPDOWN, &mm->flags);
>  
>  	arch_pick_mmap_base(&mm->mmap_base, &mm->mmap_legacy_base,
>  			arch_rnd(mmap64_rnd_bits), task_size_64bit(0),
> diff --git a/drivers/char/mem.c b/drivers/char/mem.c
> index 3c6670cf905f..9b80e622ae80 100644
> --- a/drivers/char/mem.c
> +++ b/drivers/char/mem.c
> @@ -544,7 +544,7 @@ static unsigned long get_unmapped_area_zero(struct file *file,
>  	}
>  
>  	/* Otherwise flags & MAP_PRIVATE: with no shmem object beneath it */
> -	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
> +	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
>  #else
>  	return -ENOSYS;
>  #endif
> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
> index 93ebedc5ec8c..47c126d37b59 100644
> --- a/drivers/dax/device.c
> +++ b/drivers/dax/device.c
> @@ -329,14 +329,14 @@ static unsigned long dax_get_unmapped_area(struct file *filp,
>  	if ((off + len_align) < off)
>  		goto out;
>  
> -	addr_align = current->mm->get_unmapped_area(filp, addr, len_align,
> -			pgoff, flags);
> +	addr_align = mm_get_unmapped_area(current->mm, filp, addr, len_align,
> +					  pgoff, flags);
>  	if (!IS_ERR_VALUE(addr_align)) {
>  		addr_align += (off - addr_align) & (align - 1);
>  		return addr_align;
>  	}
>   out:
> -	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> +	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
>  }
>  
>  static const struct address_space_operations dev_dax_aops = {
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 6502c7e776d1..3dee18bf47ed 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -249,11 +249,11 @@ generic_hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>  	}
>  
>  	/*
> -	 * Use mm->get_unmapped_area value as a hint to use topdown routine.
> +	 * Use MMF_TOPDOWN flag as a hint to use topdown routine.
>  	 * If architectures have special needs, they should define their own
>  	 * version of hugetlb_get_unmapped_area.
>  	 */
> -	if (mm->get_unmapped_area == arch_get_unmapped_area_topdown)
> +	if (test_bit(MMF_TOPDOWN, &mm->flags))
>  		return hugetlb_get_unmapped_area_topdown(file, addr, len,
>  				pgoff, flags);
>  	return hugetlb_get_unmapped_area_bottomup(file, addr, len,
> diff --git a/fs/proc/inode.c b/fs/proc/inode.c
> index 75396a24fd8c..d19434e2a58e 100644
> --- a/fs/proc/inode.c
> +++ b/fs/proc/inode.c
> @@ -455,8 +455,9 @@ pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned lo
>  		return pde->proc_ops->proc_get_unmapped_area(file, orig_addr, len, pgoff, flags);
>  
>  #ifdef CONFIG_MMU
> -	return current->mm->get_unmapped_area(file, orig_addr, len, pgoff, flags);
> +	return mm_get_unmapped_area(current->mm, file, orig_addr, len, pgoff, flags);
>  #endif
> +
>  	return orig_addr;
>  }
>  
> diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c
> index c7a1aa3c882b..b45c7edc3225 100644
> --- a/fs/ramfs/file-mmu.c
> +++ b/fs/ramfs/file-mmu.c
> @@ -35,7 +35,7 @@ static unsigned long ramfs_mmu_get_unmapped_area(struct file *file,
>  		unsigned long addr, unsigned long len, unsigned long pgoff,
>  		unsigned long flags)
>  {
> -	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
> +	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
>  }
>  
>  const struct file_operations ramfs_file_operations = {
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5240bd7bca33..9313e43123d4 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -777,11 +777,7 @@ struct mm_struct {
>  		} ____cacheline_aligned_in_smp;
>  
>  		struct maple_tree mm_mt;
> -#ifdef CONFIG_MMU
> -		unsigned long (*get_unmapped_area) (struct file *filp,
> -				unsigned long addr, unsigned long len,
> -				unsigned long pgoff, unsigned long flags);
> -#endif
> +
>  		unsigned long mmap_base;	/* base of mmap area */
>  		unsigned long mmap_legacy_base;	/* base of mmap area in bottom-up allocations */
>  #ifdef CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES
> diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
> index 02f5090ffea2..e62ff805cfc9 100644
> --- a/include/linux/sched/coredump.h
> +++ b/include/linux/sched/coredump.h
> @@ -92,9 +92,12 @@ static inline int get_dumpable(struct mm_struct *mm)
>  #define MMF_VM_MERGE_ANY	30
>  #define MMF_VM_MERGE_ANY_MASK	(1 << MMF_VM_MERGE_ANY)
>  
> +#define MMF_TOPDOWN		31	/* mm searches top down by default */
> +#define MMF_TOPDOWN_MASK	(1 << MMF_TOPDOWN)
> +
>  #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
>  				 MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\
> -				 MMF_VM_MERGE_ANY_MASK)
> +				 MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK)
>  
>  static inline unsigned long mmf_init_flags(unsigned long flags)
>  {
> diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
> index b6543f9d78d6..ed1caa26c8be 100644
> --- a/include/linux/sched/mm.h
> +++ b/include/linux/sched/mm.h
> @@ -8,6 +8,7 @@
>  #include <linux/mm_types.h>
>  #include <linux/gfp.h>
>  #include <linux/sync_core.h>
> +#include <linux/sched/coredump.h>
>  
>  /*
>   * Routines for handling mm_structs
> @@ -186,6 +187,10 @@ arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
>  			  unsigned long len, unsigned long pgoff,
>  			  unsigned long flags);
>  
> +unsigned long mm_get_unmapped_area(struct mm_struct *mm, struct file *filp,
> +				   unsigned long addr, unsigned long len,
> +				   unsigned long pgoff, unsigned long flags);
> +
>  unsigned long
>  generic_get_unmapped_area(struct file *filp, unsigned long addr,
>  			  unsigned long len, unsigned long pgoff,
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 5d4b448fdc50..405bab0a560c 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3520,7 +3520,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
>  #else
>  	addr = 0UL;
>  #endif
> -	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> +	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
>  }
>  
>  #else /* !CONFIG_MMU */
> diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
> index 86571e760dd6..74d566dcd2cb 100644
> --- a/kernel/bpf/arena.c
> +++ b/kernel/bpf/arena.c
> @@ -314,7 +314,7 @@ static unsigned long arena_get_unmapped_area(struct file *filp, unsigned long ad
>  			return -EINVAL;
>  	}
>  
> -	ret = current->mm->get_unmapped_area(filp, addr, len * 2, 0, flags);
> +	ret = mm_get_unmapped_area(current->mm, filp, addr, len * 2, 0, flags);
>  	if (IS_ERR_VALUE(ret))
>  		return ret;
>  	if ((ret >> 32) == ((ret + len - 1) >> 32))
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index ae2ff73bde7e..dead5e1977d8 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -980,7 +980,7 @@ static unsigned long bpf_get_unmapped_area(struct file *filp, unsigned long addr
>  	if (map->ops->map_get_unmapped_area)
>  		return map->ops->map_get_unmapped_area(filp, addr, len, pgoff, flags);
>  #ifdef CONFIG_MMU
> -	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> +	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
>  #else
>  	return addr;
>  #endif
> diff --git a/mm/debug.c b/mm/debug.c
> index c1c1a6a484e4..37a17f77df9f 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -180,9 +180,6 @@ EXPORT_SYMBOL(dump_vma);
>  void dump_mm(const struct mm_struct *mm)
>  {
>  	pr_emerg("mm %px task_size %lu\n"
> -#ifdef CONFIG_MMU
> -		"get_unmapped_area %px\n"
> -#endif
>  		"mmap_base %lu mmap_legacy_base %lu\n"
>  		"pgd %px mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n"
>  		"hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n"
> @@ -208,9 +205,6 @@ void dump_mm(const struct mm_struct *mm)
>  		"def_flags: %#lx(%pGv)\n",
>  
>  		mm, mm->task_size,
> -#ifdef CONFIG_MMU
> -		mm->get_unmapped_area,
> -#endif
>  		mm->mmap_base, mm->mmap_legacy_base,
>  		mm->pgd, atomic_read(&mm->mm_users),
>  		atomic_read(&mm->mm_count),
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9859aa4f7553..cede9ccb84dc 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -824,8 +824,8 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>  	if (len_pad < len || (off + len_pad) < off)
>  		return 0;
>  
> -	ret = current->mm->get_unmapped_area(filp, addr, len_pad,
> -					      off >> PAGE_SHIFT, flags);
> +	ret = mm_get_unmapped_area(current->mm, filp, addr, len_pad,
> +				   off >> PAGE_SHIFT, flags);
>  
>  	/*
>  	 * The failure might be due to length padding. The caller will retry
> @@ -843,8 +843,7 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>  
>  	off_sub = (off - ret) & (size - 1);
>  
> -	if (current->mm->get_unmapped_area == arch_get_unmapped_area_topdown &&
> -	    !off_sub)
> +	if (test_bit(MMF_TOPDOWN, &current->mm->flags) && !off_sub)
>  		return ret + size;
>  
>  	ret += off_sub;
> @@ -861,7 +860,7 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>  	if (ret)
>  		return ret;
>  
> -	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> +	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
>  }
>  EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
>  
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 6dbda99a47da..224e9ce1e2fd 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1813,7 +1813,8 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>  		unsigned long pgoff, unsigned long flags)
>  {
>  	unsigned long (*get_area)(struct file *, unsigned long,
> -				  unsigned long, unsigned long, unsigned long);
> +				  unsigned long, unsigned long, unsigned long)
> +				  = NULL;
>  
>  	unsigned long error = arch_mmap_check(addr, len, flags);
>  	if (error)
> @@ -1823,7 +1824,6 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>  	if (len > TASK_SIZE)
>  		return -ENOMEM;
>  
> -	get_area = current->mm->get_unmapped_area;
>  	if (file) {
>  		if (file->f_op->get_unmapped_area)
>  			get_area = file->f_op->get_unmapped_area;
> @@ -1842,7 +1842,11 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>  	if (!file)
>  		pgoff = 0;
>  
> -	addr = get_area(file, addr, len, pgoff, flags);
> +	if (get_area)
> +		addr = get_area(file, addr, len, pgoff, flags);
> +	else
> +		addr = mm_get_unmapped_area(current->mm, file, addr, len,
> +					    pgoff, flags);
>  	if (IS_ERR_VALUE(addr))
>  		return addr;
>  
> @@ -1857,6 +1861,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>  
>  EXPORT_SYMBOL(get_unmapped_area);
>  
> +unsigned long
> +mm_get_unmapped_area(struct mm_struct *mm, struct file *file,
> +		     unsigned long addr, unsigned long len,
> +		     unsigned long pgoff, unsigned long flags)
> +{

Seems like a small waste to have all call sites now need to push an
additional @mm argument onto the stack just to figure out what function
to call.

> +	if (test_bit(MMF_TOPDOWN, &mm->flags))
> +		return arch_get_unmapped_area_topdown(file, addr, len, pgoff, flags);
> +	return arch_get_unmapped_area(file, addr, len, pgoff, flags);

This seems small enough to be amenable to a static inline, but that
would require exporting the arch calls which might be painful.

Would it not be possible to drop the @mm argument and just reference
current->mm internal to this function? Just call this funcion
current_get_unmapped_area()?

Otherwise looks like a useful change to me.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag
  2024-03-27  2:42     ` Edgecombe, Rick P
@ 2024-03-27 13:15       ` Jarkko Sakkinen
  2024-03-28  3:32         ` Edgecombe, Rick P
  0 siblings, 1 reply; 9+ messages in thread
From: Jarkko Sakkinen @ 2024-03-27 13:15 UTC (permalink / raw)
  To: Edgecombe, Rick P, keescook@chromium.org, luto@kernel.org,
	dave.hansen@linux.intel.com, debug@rivosinc.com,
	akpm@linux-foundation.org, Liam.Howlett@oracle.com,
	kirill.shutemov@linux.intel.com, mingo@redhat.com,
	christophe.leroy@csgroup.eu, tglx@linutronix.de, hpa@zytor.com,
	peterz@infradead.org, bp@alien8.de, x86@kernel.org,
	broonie@kernel.org
  Cc: linux-sgx@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-mm@kvack.org, linux-cxl@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-kernel@vger.kernel.org,
	io-uring@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	nvdimm@lists.linux.dev, bpf@vger.kernel.org

On Wed, 2024-03-27 at 02:42 +0000, Edgecombe, Rick P wrote:
> On Tue, 2024-03-26 at 13:57 +0200, Jarkko Sakkinen wrote:
> > In which conditions which path is used during the initialization of
> > mm
> > and why is this the case? It is an open claim in the current form.
> 
> There is an arch_pick_mmap_layout() that arch's can have their own
> rules for. There is also a
> generic one. It gets called during exec.
> 
> > 
> > That would be nice to have documented for the sake of being
> > complete
> > description. I have zero doubts of the claim being untrue.
> 
> ...being untrue?
> 

I mean I believe the change itself makes sense, it is just not
fully documented in the commit message.

BR, Jarkko

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag
  2024-03-27  6:38   ` Dan Williams
@ 2024-03-28  3:31     ` Edgecombe, Rick P
  0 siblings, 0 replies; 9+ messages in thread
From: Edgecombe, Rick P @ 2024-03-28  3:31 UTC (permalink / raw)
  To: Williams, Dan J, keescook@chromium.org, luto@kernel.org,
	dave.hansen@linux.intel.com, debug@rivosinc.com,
	akpm@linux-foundation.org, Liam.Howlett@oracle.com,
	kirill.shutemov@linux.intel.com, mingo@redhat.com,
	christophe.leroy@csgroup.eu, tglx@linutronix.de, hpa@zytor.com,
	peterz@infradead.org, bp@alien8.de, x86@kernel.org,
	broonie@kernel.org
  Cc: linux-sgx@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-mm@kvack.org, linux-cxl@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-kernel@vger.kernel.org,
	io-uring@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	nvdimm@lists.linux.dev, bpf@vger.kernel.org

On Tue, 2024-03-26 at 23:38 -0700, Dan Williams wrote:
> > +unsigned long
> > +mm_get_unmapped_area(struct mm_struct *mm, struct file *file,
> > +                    unsigned long addr, unsigned long len,
> > +                    unsigned long pgoff, unsigned long flags)
> > +{
> 
> Seems like a small waste to have all call sites now need to push an
> additional @mm argument onto the stack just to figure out what function
> to call.
> 
> > +       if (test_bit(MMF_TOPDOWN, &mm->flags))
> > +               return arch_get_unmapped_area_topdown(file, addr, len, pgoff, flags);
> > +       return arch_get_unmapped_area(file, addr, len, pgoff, flags);
> 
> This seems small enough to be amenable to a static inline, but that
> would require exporting the arch calls which might be painful.
> 
> Would it not be possible to drop the @mm argument and just reference
> current->mm internal to this function? Just call this funcion
> current_get_unmapped_area()?

Hmm, you are right. The callers all pass current->mm. The mm argument could be removed.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag
  2024-03-27 13:15       ` Jarkko Sakkinen
@ 2024-03-28  3:32         ` Edgecombe, Rick P
  0 siblings, 0 replies; 9+ messages in thread
From: Edgecombe, Rick P @ 2024-03-28  3:32 UTC (permalink / raw)
  To: keescook@chromium.org, luto@kernel.org,
	dave.hansen@linux.intel.com, debug@rivosinc.com,
	akpm@linux-foundation.org, Liam.Howlett@oracle.com,
	mingo@redhat.com, kirill.shutemov@linux.intel.com,
	tglx@linutronix.de, christophe.leroy@csgroup.eu,
	jarkko@kernel.org, hpa@zytor.com, peterz@infradead.org,
	bp@alien8.de, x86@kernel.org, broonie@kernel.org
  Cc: linux-sgx@vger.kernel.org, io-uring@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-cxl@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
	bpf@vger.kernel.org, linux-mm@kvack.org

On Wed, 2024-03-27 at 15:15 +0200, Jarkko Sakkinen wrote:
> I mean I believe the change itself makes sense, it is just not
> fully documented in the commit message.

Ah, I see. Yes, there could be more background on arch_pick_mmap_layout().

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-03-28  3:32 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20240326021656.202649-1-rick.p.edgecombe@intel.com>
2024-03-26  2:16 ` [PATCH v4 02/14] mm: Switch mm->get_unmapped_area() to a flag Rick Edgecombe
2024-03-26  3:32   ` Alexei Starovoitov
2024-03-26 11:57   ` Jarkko Sakkinen
2024-03-27  2:42     ` Edgecombe, Rick P
2024-03-27 13:15       ` Jarkko Sakkinen
2024-03-28  3:32         ` Edgecombe, Rick P
2024-03-27  6:38   ` Dan Williams
2024-03-28  3:31     ` Edgecombe, Rick P
2024-03-26  2:16 ` [PATCH v4 10/14] treewide: Use initializer for struct vm_unmapped_area_info Rick Edgecombe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox