linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v12 0/4] support large align and nid in Rust allocators
@ 2025-07-09 17:23 Vitaly Wool
  2025-07-09 17:24 ` [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc Vitaly Wool
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Vitaly Wool @ 2025-07-09 17:23 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato, Vitaly Wool

The coming patches provide the ability for Rust allocators to set
NUMA node and large alignment.

Changelog:
v2 -> v3:
* fixed the build breakage for non-MMU configs
v3 -> v4:
* added NUMA node support for k[v]realloc (patch #2)
* removed extra logic in Rust helpers
* patch for Rust allocators split into 2 (align: patch #3 and
  NUMA ids: patch #4)
v4 -> v5:
* reworked NUMA node support for k[v]realloc for all 3 <alloc>_node
  functions to have the same signature
* all 3 <alloc>_node slab/vmalloc functions now support alignment
  specification
* Rust helpers are extended with new functions, the old ones are left
  intact
* Rust support for NUMA nodes comes first now (as patch #3)
v5 -> v6:
* added <alloc>_node_align functions to keep the existing interfaces
  intact
* clearer separation for Rust support of MUNA ids and large alignments
v6 -> v7:
* NUMA identifier as a new Rust type (NumaNode)
* better documentation for changed and new functions and constants
v7 -> v8:
* removed NumaError
* small cleanups per reviewers' comments
v8 -> v9:
* realloc functions can now reallocate memory for a different NUMA
  node
* better comments/explanations in the Rust part
v9 -> v10:
* refined behavior when memory is being reallocated for a different
  NUMA node, comments added
* cleanups in the Rust part, rustfmt ran
* typos corrected
v10 -> v11:
* added documentation for the NO_NODE constant
* added node parameter to Allocator's alloc/realloc instead of adding
  separate alloc_node resp. realloc_node functions, modified users of
  alloc/realloc in accordance with that
v11 -> v12:
* some redundant _noprof functions removed in patch 2/4
* c'n'p error fixed in patch 2/4 (vmalloc_to_page -> virt_to_page)
* some typo corrections and documentation updates, primarily in patch
  3/4

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 17:23 [PATCH v12 0/4] support large align and nid in Rust allocators Vitaly Wool
@ 2025-07-09 17:24 ` Vitaly Wool
  2025-07-09 19:01   ` Liam R. Howlett
  2025-07-09 22:53   ` Alexei Starovoitov
  2025-07-09 17:24 ` [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc Vitaly Wool
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 28+ messages in thread
From: Vitaly Wool @ 2025-07-09 17:24 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato, Vitaly Wool

Reimplement vrealloc() to be able to set node and alignment should
a user need to do so. Rename the function to vrealloc_node_align()
to better match what it actually does now and introduce macros for
vrealloc() and friends for backward compatibility.

With that change we also provide the ability for the Rust part of
the kernel to set node and alignment in its allocations.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/vmalloc.h | 12 +++++++++---
 mm/nommu.c              |  3 ++-
 mm/vmalloc.c            | 31 ++++++++++++++++++++++++++-----
 3 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index fdc9aeb74a44..68791f7cb3ba 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -197,9 +197,15 @@ extern void *__vcalloc_noprof(size_t n, size_t size, gfp_t flags) __alloc_size(1
 extern void *vcalloc_noprof(size_t n, size_t size) __alloc_size(1, 2);
 #define vcalloc(...)		alloc_hooks(vcalloc_noprof(__VA_ARGS__))
 
-void * __must_check vrealloc_noprof(const void *p, size_t size, gfp_t flags)
-		__realloc_size(2);
-#define vrealloc(...)		alloc_hooks(vrealloc_noprof(__VA_ARGS__))
+void *__must_check vrealloc_node_align_noprof(const void *p, size_t size,
+		unsigned long align, gfp_t flags, int nid) __realloc_size(2);
+#define vrealloc_node_noprof(_p, _s, _f, _nid)	\
+	vrealloc_node_align_noprof(_p, _s, 1, _f, _nid)
+#define vrealloc_noprof(_p, _s, _f)		\
+	vrealloc_node_align_noprof(_p, _s, 1, _f, NUMA_NO_NODE)
+#define vrealloc_node_align(...)		alloc_hooks(vrealloc_node_align_noprof(__VA_ARGS__))
+#define vrealloc_node(...)			alloc_hooks(vrealloc_node_noprof(__VA_ARGS__))
+#define vrealloc(...)				alloc_hooks(vrealloc_noprof(__VA_ARGS__))
 
 extern void vfree(const void *addr);
 extern void vfree_atomic(const void *addr);
diff --git a/mm/nommu.c b/mm/nommu.c
index 87e1acab0d64..8359b2025b9f 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -119,7 +119,8 @@ void *__vmalloc_noprof(unsigned long size, gfp_t gfp_mask)
 }
 EXPORT_SYMBOL(__vmalloc_noprof);
 
-void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
+void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
+				 gfp_t flags, int node)
 {
 	return krealloc_noprof(p, size, (flags | __GFP_COMP) & ~__GFP_HIGHMEM);
 }
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6dbcdceecae1..03dd06097b25 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -4089,19 +4089,31 @@ void *vzalloc_node_noprof(unsigned long size, int node)
 EXPORT_SYMBOL(vzalloc_node_noprof);
 
 /**
- * vrealloc - reallocate virtually contiguous memory; contents remain unchanged
+ * vrealloc_node_align_noprof - reallocate virtually contiguous memory; contents
+ * remain unchanged
  * @p: object to reallocate memory for
  * @size: the size to reallocate
+ * @align: requested alignment
  * @flags: the flags for the page level allocator
+ * @nid: node number of the target node
+ *
+ * If @p is %NULL, vrealloc_XXX() behaves exactly like vmalloc(). If @size is
+ * 0 and @p is not a %NULL pointer, the object pointed to is freed.
  *
- * If @p is %NULL, vrealloc() behaves exactly like vmalloc(). If @size is 0 and
- * @p is not a %NULL pointer, the object pointed to is freed.
+ * if @nid is not NUMA_NO_NODE, this function will try to allocate memory on
+ * the given node. If reallocation is not necessary (e. g. the new size is less
+ * than the current allocated size), the current allocation will be preserved
+ * unless __GFP_THISNODE is set. In the latter case a new allocation on the
+ * requested node will be attempted.
  *
  * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
  * initial memory allocation, every subsequent call to this API for the same
  * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
  * __GFP_ZERO is not fully honored by this API.
  *
+ * If the requested alignment is bigger than the one the *existing* allocation
+ * has, this function will fail.
+ *
  * In any case, the contents of the object pointed to are preserved up to the
  * lesser of the new and old sizes.
  *
@@ -4111,7 +4123,8 @@ EXPORT_SYMBOL(vzalloc_node_noprof);
  * Return: pointer to the allocated memory; %NULL if @size is zero or in case of
  *         failure
  */
-void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
+void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
+				 gfp_t flags, int nid)
 {
 	struct vm_struct *vm = NULL;
 	size_t alloced_size = 0;
@@ -4135,6 +4148,12 @@ void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
 		if (WARN(alloced_size < old_size,
 			 "vrealloc() has mismatched area vs requested sizes (%p)\n", p))
 			return NULL;
+		if (WARN(!IS_ALIGNED((unsigned long)p, align),
+			 "will not reallocate with a bigger alignment (0x%lx)\n", align))
+			return NULL;
+		if (unlikely(flags & __GFP_THISNODE) && nid != NUMA_NO_NODE &&
+			     nid != page_to_nid(vmalloc_to_page(p)))
+			goto need_realloc;
 	}
 
 	/*
@@ -4165,8 +4184,10 @@ void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
 		return (void *)p;
 	}
 
+need_realloc:
 	/* TODO: Grow the vm_area, i.e. allocate and map additional pages. */
-	n = __vmalloc_noprof(size, flags);
+	n = __vmalloc_node_noprof(size, align, flags, nid, __builtin_return_address(0));
+
 	if (!n)
 		return NULL;
 
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-09 17:23 [PATCH v12 0/4] support large align and nid in Rust allocators Vitaly Wool
  2025-07-09 17:24 ` [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc Vitaly Wool
@ 2025-07-09 17:24 ` Vitaly Wool
  2025-07-10  7:45   ` Vlastimil Babka
  2025-07-11  8:58   ` Harry Yoo
  2025-07-09 17:24 ` [PATCH v12 3/4] rust: add support for NUMA ids in allocations Vitaly Wool
  2025-07-09 17:25 ` [PATCH v12 4/4] rust: support large alignments " Vitaly Wool
  3 siblings, 2 replies; 28+ messages in thread
From: Vitaly Wool @ 2025-07-09 17:24 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato, Vitaly Wool

Reimplement k[v]realloc_node() to be able to set node and
alignment should a user need to do so. In order to do that while
retaining the maximal backward compatibility, add
k[v]realloc_node_align() functions and redefine the rest of API
using these new ones.

While doing that, we also keep the number of  _noprof variants to a
minimum, which implies some changes to the existing users of older
_noprof functions, that basically being bcachefs.

With that change we also provide the ability for the Rust part of
the kernel to set node and alignment in its K[v]xxx
[re]allocations.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
---
 fs/bcachefs/darray.c   |  2 +-
 fs/bcachefs/util.h     |  2 +-
 include/linux/bpfptr.h |  2 +-
 include/linux/slab.h   | 38 +++++++++++++++----------
 lib/rhashtable.c       |  4 +--
 mm/slub.c              | 64 +++++++++++++++++++++++++++++-------------
 6 files changed, 72 insertions(+), 40 deletions(-)

diff --git a/fs/bcachefs/darray.c b/fs/bcachefs/darray.c
index e86d36d23e9e..928e83a1ce42 100644
--- a/fs/bcachefs/darray.c
+++ b/fs/bcachefs/darray.c
@@ -21,7 +21,7 @@ int __bch2_darray_resize_noprof(darray_char *d, size_t element_size, size_t new_
 			return -ENOMEM;
 
 		void *data = likely(bytes < INT_MAX)
-			? kvmalloc_noprof(bytes, gfp)
+			? kvmalloc_node_align_noprof(bytes, 1, gfp, NUMA_NO_NODE)
 			: vmalloc_noprof(bytes);
 		if (!data)
 			return -ENOMEM;
diff --git a/fs/bcachefs/util.h b/fs/bcachefs/util.h
index 6488f098d140..7112fd40ee21 100644
--- a/fs/bcachefs/util.h
+++ b/fs/bcachefs/util.h
@@ -61,7 +61,7 @@ static inline void *bch2_kvmalloc_noprof(size_t n, gfp_t flags)
 {
 	void *p = unlikely(n >= INT_MAX)
 		? vmalloc_noprof(n)
-		: kvmalloc_noprof(n, flags & ~__GFP_ZERO);
+		: kvmalloc_node_align_noprof(n, 1, flags & ~__GFP_ZERO, NUMA_NO_NODE);
 	if (p && (flags & __GFP_ZERO))
 		memset(p, 0, n);
 	return p;
diff --git a/include/linux/bpfptr.h b/include/linux/bpfptr.h
index 1af241525a17..f6e0795db484 100644
--- a/include/linux/bpfptr.h
+++ b/include/linux/bpfptr.h
@@ -67,7 +67,7 @@ static inline int copy_to_bpfptr_offset(bpfptr_t dst, size_t offset,
 
 static inline void *kvmemdup_bpfptr_noprof(bpfptr_t src, size_t len)
 {
-	void *p = kvmalloc_noprof(len, GFP_USER | __GFP_NOWARN);
+	void *p = kvmalloc_node_align_noprof(len, 1, GFP_USER | __GFP_NOWARN, NUMA_NO_NODE);
 
 	if (!p)
 		return ERR_PTR(-ENOMEM);
diff --git a/include/linux/slab.h b/include/linux/slab.h
index d5a8ab98035c..2877900cb9a7 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -465,9 +465,13 @@ int kmem_cache_shrink(struct kmem_cache *s);
 /*
  * Common kmalloc functions provided by all allocators
  */
-void * __must_check krealloc_noprof(const void *objp, size_t new_size,
-				    gfp_t flags) __realloc_size(2);
-#define krealloc(...)				alloc_hooks(krealloc_noprof(__VA_ARGS__))
+void * __must_check krealloc_node_align_noprof(const void *objp, size_t new_size,
+					       unsigned long align,
+					       gfp_t flags, int nid) __realloc_size(2);
+#define krealloc_noprof(_o, _s, _f)	krealloc_node_align_noprof(_o, _s, 1, _f, NUMA_NO_NODE)
+#define krealloc_node_align(...)	alloc_hooks(krealloc_node_align_noprof(__VA_ARGS__))
+#define krealloc_node(_o, _s, _f, _n)	krealloc_node_align(_o, _s, 1, _f, _n)
+#define krealloc(...)			krealloc_node(__VA_ARGS__, NUMA_NO_NODE)
 
 void kfree(const void *objp);
 void kfree_sensitive(const void *objp);
@@ -1041,18 +1045,20 @@ static inline __alloc_size(1) void *kzalloc_noprof(size_t size, gfp_t flags)
 #define kzalloc(...)				alloc_hooks(kzalloc_noprof(__VA_ARGS__))
 #define kzalloc_node(_size, _flags, _node)	kmalloc_node(_size, (_flags)|__GFP_ZERO, _node)
 
-void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node) __alloc_size(1);
-#define kvmalloc_node_noprof(size, flags, node)	\
-	__kvmalloc_node_noprof(PASS_BUCKET_PARAMS(size, NULL), flags, node)
-#define kvmalloc_node(...)			alloc_hooks(kvmalloc_node_noprof(__VA_ARGS__))
-
-#define kvmalloc(_size, _flags)			kvmalloc_node(_size, _flags, NUMA_NO_NODE)
-#define kvmalloc_noprof(_size, _flags)		kvmalloc_node_noprof(_size, _flags, NUMA_NO_NODE)
+void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), unsigned long align,
+			     gfp_t flags, int node) __alloc_size(1);
+#define kvmalloc_node_align_noprof(_size, _align, _flags, _node)	\
+	__kvmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, NULL), _align, _flags, _node)
+#define kvmalloc_node_align(...)		\
+	alloc_hooks(kvmalloc_node_align_noprof(__VA_ARGS__))
+#define kvmalloc_node(_s, _f, _n)		kvmalloc_node_align(_s, 1, _f, _n)
+#define kvmalloc(...)				kvmalloc_node(__VA_ARGS__, NUMA_NO_NODE)
 #define kvzalloc(_size, _flags)			kvmalloc(_size, (_flags)|__GFP_ZERO)
 
 #define kvzalloc_node(_size, _flags, _node)	kvmalloc_node(_size, (_flags)|__GFP_ZERO, _node)
+
 #define kmem_buckets_valloc(_b, _size, _flags)	\
-	alloc_hooks(__kvmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, _b), _flags, NUMA_NO_NODE))
+	alloc_hooks(__kvmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, _b), 1, _flags, NUMA_NO_NODE))
 
 static inline __alloc_size(1, 2) void *
 kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
@@ -1062,7 +1068,7 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
 	if (unlikely(check_mul_overflow(n, size, &bytes)))
 		return NULL;
 
-	return kvmalloc_node_noprof(bytes, flags, node);
+	return kvmalloc_node_align_noprof(bytes, 1, flags, node);
 }
 
 #define kvmalloc_array_noprof(...)		kvmalloc_array_node_noprof(__VA_ARGS__, NUMA_NO_NODE)
@@ -1073,9 +1079,11 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
 #define kvcalloc_node(...)			alloc_hooks(kvcalloc_node_noprof(__VA_ARGS__))
 #define kvcalloc(...)				alloc_hooks(kvcalloc_noprof(__VA_ARGS__))
 
-void *kvrealloc_noprof(const void *p, size_t size, gfp_t flags)
-		__realloc_size(2);
-#define kvrealloc(...)				alloc_hooks(kvrealloc_noprof(__VA_ARGS__))
+void *kvrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
+				  gfp_t flags, int nid) __realloc_size(2);
+#define kvrealloc_node_align(...)		alloc_hooks(kvrealloc_node_align_noprof(__VA_ARGS__))
+#define kvrealloc_node(_p, _s, _f, _n)		kvrealloc_node_align(_p, _s, 1, _f, _n)
+#define kvrealloc(...)				kvrealloc_node(__VA_ARGS__, NUMA_NO_NODE)
 
 extern void kvfree(const void *addr);
 DEFINE_FREE(kvfree, void *, if (!IS_ERR_OR_NULL(_T)) kvfree(_T))
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 3e555d012ed6..fde0f0e556f8 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -184,8 +184,8 @@ static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
 	static struct lock_class_key __key;
 
 	tbl = alloc_hooks_tag(ht->alloc_tag,
-			kvmalloc_node_noprof(struct_size(tbl, buckets, nbuckets),
-					     gfp|__GFP_ZERO, NUMA_NO_NODE));
+			kvmalloc_node_align_noprof(struct_size(tbl, buckets, nbuckets),
+					     1, gfp|__GFP_ZERO, NUMA_NO_NODE));
 
 	size = nbuckets;
 
diff --git a/mm/slub.c b/mm/slub.c
index c4b64821e680..6fad4cdea6c4 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4845,7 +4845,7 @@ void kfree(const void *object)
 EXPORT_SYMBOL(kfree);
 
 static __always_inline __realloc_size(2) void *
-__do_krealloc(const void *p, size_t new_size, gfp_t flags)
+__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid)
 {
 	void *ret;
 	size_t ks = 0;
@@ -4859,6 +4859,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
 	if (!kasan_check_byte(p))
 		return NULL;
 
+	/* refuse to proceed if alignment is bigger than what kmalloc() provides */
+	if (!IS_ALIGNED((unsigned long)p, align) || new_size < align)
+		return NULL;
+
+	/*
+	 * If reallocation is not necessary (e. g. the new size is less
+	 * than the current allocated size), the current allocation will be
+	 * preserved unless __GFP_THISNODE is set. In the latter case a new
+	 * allocation on the requested node will be attempted.
+	 */
+	if (unlikely(flags & __GFP_THISNODE) && nid != NUMA_NO_NODE &&
+		     nid != page_to_nid(virt_to_page(p)))
+		goto alloc_new;
+
 	if (is_kfence_address(p)) {
 		ks = orig_size = kfence_ksize(p);
 	} else {
@@ -4903,7 +4917,7 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
 	return (void *)p;
 
 alloc_new:
-	ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
+	ret = kmalloc_node_track_caller_noprof(new_size, flags, nid, _RET_IP_);
 	if (ret && p) {
 		/* Disable KASAN checks as the object's redzone is accessed. */
 		kasan_disable_current();
@@ -4915,10 +4929,12 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
 }
 
 /**
- * krealloc - reallocate memory. The contents will remain unchanged.
+ * krealloc_node_align - reallocate memory. The contents will remain unchanged.
  * @p: object to reallocate memory for.
  * @new_size: how many bytes of memory are required.
+ * @align: desired alignment.
  * @flags: the type of memory to allocate.
+ * @nid: NUMA node or NUMA_NO_NODE
  *
  * If @p is %NULL, krealloc() behaves exactly like kmalloc().  If @new_size
  * is 0 and @p is not a %NULL pointer, the object pointed to is freed.
@@ -4947,7 +4963,8 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
  *
  * Return: pointer to the allocated memory or %NULL in case of error
  */
-void *krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
+void *krealloc_node_align_noprof(const void *p, size_t new_size, unsigned long align,
+				 gfp_t flags, int nid)
 {
 	void *ret;
 
@@ -4956,13 +4973,13 @@ void *krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
 		return ZERO_SIZE_PTR;
 	}
 
-	ret = __do_krealloc(p, new_size, flags);
+	ret = __do_krealloc(p, new_size, align, flags, nid);
 	if (ret && kasan_reset_tag(p) != kasan_reset_tag(ret))
 		kfree(p);
 
 	return ret;
 }
-EXPORT_SYMBOL(krealloc_noprof);
+EXPORT_SYMBOL(krealloc_node_align_noprof);
 
 static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
 {
@@ -4993,6 +5010,7 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
  * failure, fall back to non-contiguous (vmalloc) allocation.
  * @size: size of the request.
  * @b: which set of kmalloc buckets to allocate from.
+ * @align: desired alignment.
  * @flags: gfp mask for the allocation - must be compatible (superset) with GFP_KERNEL.
  * @node: numa node to allocate from
  *
@@ -5005,19 +5023,22 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
  *
  * Return: pointer to the allocated memory of %NULL in case of failure
  */
-void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
+void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), unsigned long align,
+			     gfp_t flags, int node)
 {
 	void *ret;
 
 	/*
 	 * It doesn't really make sense to fallback to vmalloc for sub page
-	 * requests
+	 * requests and small alignments
 	 */
-	ret = __do_kmalloc_node(size, PASS_BUCKET_PARAM(b),
-				kmalloc_gfp_adjust(flags, size),
-				node, _RET_IP_);
-	if (ret || size <= PAGE_SIZE)
-		return ret;
+	if (size >= align) {
+		ret = __do_kmalloc_node(size, PASS_BUCKET_PARAM(b),
+					kmalloc_gfp_adjust(flags, size),
+					node, _RET_IP_);
+		if (ret || size <= PAGE_SIZE)
+			return ret;
+	}
 
 	/* non-sleeping allocations are not supported by vmalloc */
 	if (!gfpflags_allow_blocking(flags))
@@ -5035,7 +5056,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
 	 * about the resulting pointer, and cannot play
 	 * protection games.
 	 */
-	return __vmalloc_node_range_noprof(size, 1, VMALLOC_START, VMALLOC_END,
+	return __vmalloc_node_range_noprof(size, align, VMALLOC_START, VMALLOC_END,
 			flags, PAGE_KERNEL, VM_ALLOW_HUGE_VMAP,
 			node, __builtin_return_address(0));
 }
@@ -5079,10 +5100,12 @@ void kvfree_sensitive(const void *addr, size_t len)
 EXPORT_SYMBOL(kvfree_sensitive);
 
 /**
- * kvrealloc - reallocate memory; contents remain unchanged
+ * kvrealloc_node_align - reallocate memory; contents remain unchanged
  * @p: object to reallocate memory for
  * @size: the size to reallocate
+ * @align: desired alignment
  * @flags: the flags for the page level allocator
+ * @nid: NUMA node id
  *
  * If @p is %NULL, kvrealloc() behaves exactly like kvmalloc(). If @size is 0
  * and @p is not a %NULL pointer, the object pointed to is freed.
@@ -5100,17 +5123,18 @@ EXPORT_SYMBOL(kvfree_sensitive);
  *
  * Return: pointer to the allocated memory or %NULL in case of error
  */
-void *kvrealloc_noprof(const void *p, size_t size, gfp_t flags)
+void *kvrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
+				  gfp_t flags, int nid)
 {
 	void *n;
 
 	if (is_vmalloc_addr(p))
-		return vrealloc_noprof(p, size, flags);
+		return vrealloc_node_align_noprof(p, size, align, flags, nid);
 
-	n = krealloc_noprof(p, size, kmalloc_gfp_adjust(flags, size));
+	n = krealloc_node_align_noprof(p, size, align, kmalloc_gfp_adjust(flags, size), nid);
 	if (!n) {
 		/* We failed to krealloc(), fall back to kvmalloc(). */
-		n = kvmalloc_noprof(size, flags);
+		n = kvmalloc_node_align_noprof(size, align, flags, nid);
 		if (!n)
 			return NULL;
 
@@ -5126,7 +5150,7 @@ void *kvrealloc_noprof(const void *p, size_t size, gfp_t flags)
 
 	return n;
 }
-EXPORT_SYMBOL(kvrealloc_noprof);
+EXPORT_SYMBOL(kvrealloc_node_align_noprof);
 
 struct detached_freelist {
 	struct slab *slab;
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v12 3/4] rust: add support for NUMA ids in allocations
  2025-07-09 17:23 [PATCH v12 0/4] support large align and nid in Rust allocators Vitaly Wool
  2025-07-09 17:24 ` [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc Vitaly Wool
  2025-07-09 17:24 ` [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc Vitaly Wool
@ 2025-07-09 17:24 ` Vitaly Wool
  2025-07-09 20:58   ` Danilo Krummrich
  2025-07-09 17:25 ` [PATCH v12 4/4] rust: support large alignments " Vitaly Wool
  3 siblings, 1 reply; 28+ messages in thread
From: Vitaly Wool @ 2025-07-09 17:24 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato, Vitaly Wool

Add a new type to support specifying NUMA identifiers in Rust
allocators and extend the allocators to have NUMA id as a
parameter. Thus, modify ReallocFunc to use the new extended realloc
primitives from the C side of the kernel (i. e.
k[v]realloc_node_align/vrealloc_node_align) and add the new function
alloc_node to the Allocator trait while keeping the existing one
(alloc) for backward compatibility.

This will allow to specify node to use for allocation of e. g.
{KV}Box, as well as for future NUMA aware users of the API.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
---
 rust/helpers/slab.c            |  8 +++---
 rust/helpers/vmalloc.c         |  4 +--
 rust/kernel/alloc.rs           | 52 ++++++++++++++++++++++++++++++----
 rust/kernel/alloc/allocator.rs | 35 ++++++++++++++---------
 rust/kernel/alloc/kbox.rs      |  4 +--
 rust/kernel/alloc/kvec.rs      | 11 +++++--
 6 files changed, 86 insertions(+), 28 deletions(-)

diff --git a/rust/helpers/slab.c b/rust/helpers/slab.c
index a842bfbddcba..8472370a4338 100644
--- a/rust/helpers/slab.c
+++ b/rust/helpers/slab.c
@@ -3,13 +3,13 @@
 #include <linux/slab.h>
 
 void * __must_check __realloc_size(2)
-rust_helper_krealloc(const void *objp, size_t new_size, gfp_t flags)
+rust_helper_krealloc_node(const void *objp, size_t new_size, gfp_t flags, int node)
 {
-	return krealloc(objp, new_size, flags);
+	return krealloc_node(objp, new_size, flags, node);
 }
 
 void * __must_check __realloc_size(2)
-rust_helper_kvrealloc(const void *p, size_t size, gfp_t flags)
+rust_helper_kvrealloc_node(const void *p, size_t size, gfp_t flags, int node)
 {
-	return kvrealloc(p, size, flags);
+	return kvrealloc_node(p, size, flags, node);
 }
diff --git a/rust/helpers/vmalloc.c b/rust/helpers/vmalloc.c
index 80d34501bbc0..62d30db9a1a6 100644
--- a/rust/helpers/vmalloc.c
+++ b/rust/helpers/vmalloc.c
@@ -3,7 +3,7 @@
 #include <linux/vmalloc.h>
 
 void * __must_check __realloc_size(2)
-rust_helper_vrealloc(const void *p, size_t size, gfp_t flags)
+rust_helper_vrealloc_node(const void *p, size_t size, gfp_t flags, int node)
 {
-	return vrealloc(p, size, flags);
+	return vrealloc_node(p, size, flags, node);
 }
diff --git a/rust/kernel/alloc.rs b/rust/kernel/alloc.rs
index a2c49e5494d3..6ba1675c9da0 100644
--- a/rust/kernel/alloc.rs
+++ b/rust/kernel/alloc.rs
@@ -28,6 +28,8 @@
 /// Indicates an allocation error.
 #[derive(Copy, Clone, PartialEq, Eq, Debug)]
 pub struct AllocError;
+
+use crate::error::{code::EINVAL, Result};
 use core::{alloc::Layout, ptr::NonNull};
 
 /// Flags to be used when allocating memory.
@@ -115,6 +117,29 @@ pub mod flags {
     pub const __GFP_NOWARN: Flags = Flags(bindings::__GFP_NOWARN);
 }
 
+/// Non Uniform Memory Access (NUMA) node identifier
+#[derive(Clone, Copy, PartialEq)]
+pub struct NumaNode(i32);
+
+impl NumaNode {
+    /// create a new NUMA node identifer (non-negative integer)
+    /// returns EINVAL if a negative id or an id exceeding MAX_NUMNODES is specified
+    pub fn new(node: i32) -> Result<Self> {
+        // SAFETY: MAX_NUMNODES never exceeds 2**10 because NODES_SHIFT is 0..10
+        if node < 0 || node >= bindings::MAX_NUMNODES as i32 {
+            return Err(EINVAL);
+        }
+        Ok(Self(node))
+    }
+}
+
+/// Specify necessary constant to pass the information to Allocator that the caller doesn't care
+/// about the NUMA node to allocate memory from.
+impl NumaNode {
+    /// No node preference.
+    pub const NO_NODE: NumaNode = NumaNode(bindings::NUMA_NO_NODE);
+}
+
 /// The kernel's [`Allocator`] trait.
 ///
 /// An implementation of [`Allocator`] can allocate, re-allocate and free memory buffers described
@@ -137,7 +162,7 @@ pub mod flags {
 /// - Implementers must ensure that all trait functions abide by the guarantees documented in the
 ///   `# Guarantees` sections.
 pub unsafe trait Allocator {
-    /// Allocate memory based on `layout` and `flags`.
+    /// Allocate memory based on `layout`, `flags` and `nid`.
     ///
     /// On success, returns a buffer represented as `NonNull<[u8]>` that satisfies the layout
     /// constraints (i.e. minimum size and alignment as specified by `layout`).
@@ -153,13 +178,21 @@ pub unsafe trait Allocator {
     ///
     /// Additionally, `Flags` are honored as documented in
     /// <https://docs.kernel.org/core-api/mm-api.html#mm-api-gfp-flags>.
-    fn alloc(layout: Layout, flags: Flags) -> Result<NonNull<[u8]>, AllocError> {
+    fn alloc(layout: Layout, flags: Flags, nid: NumaNode) -> Result<NonNull<[u8]>, AllocError> {
         // SAFETY: Passing `None` to `realloc` is valid by its safety requirements and asks for a
         // new memory allocation.
-        unsafe { Self::realloc(None, layout, Layout::new::<()>(), flags) }
+        unsafe { Self::realloc(None, layout, Layout::new::<()>(), flags, nid) }
     }
 
-    /// Re-allocate an existing memory allocation to satisfy the requested `layout`.
+    /// Re-allocate an existing memory allocation to satisfy the requested `layout` and
+    /// a specific NUMA node request to allocate the memory for.
+    ///
+    /// Systems employing a Non Uniform Memory Access (NUMA) architecture contain collections of
+    /// hardware resources including processors, memory, and I/O buses, that comprise what is
+    /// commonly known as a NUMA node.
+    ///
+    /// `nid` stands for NUMA id, i. e. NUMA node identifier, which is a non-negative integer
+    /// if a node needs to be specified, or [`NumaNode::NO_NODE`] if the caller doesn't care.
     ///
     /// If the requested size is zero, `realloc` behaves equivalent to `free`.
     ///
@@ -196,6 +229,7 @@ unsafe fn realloc(
         layout: Layout,
         old_layout: Layout,
         flags: Flags,
+        nid: NumaNode,
     ) -> Result<NonNull<[u8]>, AllocError>;
 
     /// Free an existing memory allocation.
@@ -211,7 +245,15 @@ unsafe fn free(ptr: NonNull<u8>, layout: Layout) {
         // SAFETY: The caller guarantees that `ptr` points at a valid allocation created by this
         // allocator. We are passing a `Layout` with the smallest possible alignment, so it is
         // smaller than or equal to the alignment previously used with this allocation.
-        let _ = unsafe { Self::realloc(Some(ptr), Layout::new::<()>(), layout, Flags(0)) };
+        let _ = unsafe {
+            Self::realloc(
+                Some(ptr),
+                Layout::new::<()>(),
+                layout,
+                Flags(0),
+                NumaNode::NO_NODE,
+            )
+        };
     }
 }
 
diff --git a/rust/kernel/alloc/allocator.rs b/rust/kernel/alloc/allocator.rs
index aa2dfa9dca4c..8af7e04e3cc6 100644
--- a/rust/kernel/alloc/allocator.rs
+++ b/rust/kernel/alloc/allocator.rs
@@ -13,7 +13,7 @@
 use core::ptr;
 use core::ptr::NonNull;
 
-use crate::alloc::{AllocError, Allocator};
+use crate::alloc::{AllocError, Allocator, NumaNode};
 use crate::bindings;
 use crate::pr_warn;
 
@@ -56,20 +56,25 @@ fn aligned_size(new_layout: Layout) -> usize {
 
 /// # Invariants
 ///
-/// One of the following: `krealloc`, `vrealloc`, `kvrealloc`.
+/// One of the following: `krealloc_node`, `vrealloc_node`, `kvrealloc_node`.
 struct ReallocFunc(
-    unsafe extern "C" fn(*const crate::ffi::c_void, usize, u32) -> *mut crate::ffi::c_void,
+    unsafe extern "C" fn(
+        *const crate::ffi::c_void,
+        usize,
+        u32,
+        crate::ffi::c_int,
+    ) -> *mut crate::ffi::c_void,
 );
 
 impl ReallocFunc {
-    // INVARIANT: `krealloc` satisfies the type invariants.
-    const KREALLOC: Self = Self(bindings::krealloc);
+    // INVARIANT: `krealloc_node` satisfies the type invariants.
+    const KREALLOC: Self = Self(bindings::krealloc_node);
 
-    // INVARIANT: `vrealloc` satisfies the type invariants.
-    const VREALLOC: Self = Self(bindings::vrealloc);
+    // INVARIANT: `vrealloc_node` satisfies the type invariants.
+    const VREALLOC: Self = Self(bindings::vrealloc_node);
 
-    // INVARIANT: `kvrealloc` satisfies the type invariants.
-    const KVREALLOC: Self = Self(bindings::kvrealloc);
+    // INVARIANT: `kvrealloc_node` satisfies the type invariants.
+    const KVREALLOC: Self = Self(bindings::kvrealloc_node);
 
     /// # Safety
     ///
@@ -87,6 +92,7 @@ unsafe fn call(
         layout: Layout,
         old_layout: Layout,
         flags: Flags,
+        nid: NumaNode,
     ) -> Result<NonNull<[u8]>, AllocError> {
         let size = aligned_size(layout);
         let ptr = match ptr {
@@ -110,7 +116,7 @@ unsafe fn call(
         // - Those functions provide the guarantees of this function.
         let raw_ptr = unsafe {
             // If `size == 0` and `ptr != NULL` the memory behind the pointer is freed.
-            self.0(ptr.cast(), size, flags.0).cast()
+            self.0(ptr.cast(), size, flags.0, nid.0).cast()
         };
 
         let ptr = if size == 0 {
@@ -134,9 +140,10 @@ unsafe fn realloc(
         layout: Layout,
         old_layout: Layout,
         flags: Flags,
+        nid: NumaNode,
     ) -> Result<NonNull<[u8]>, AllocError> {
         // SAFETY: `ReallocFunc::call` has the same safety requirements as `Allocator::realloc`.
-        unsafe { ReallocFunc::KREALLOC.call(ptr, layout, old_layout, flags) }
+        unsafe { ReallocFunc::KREALLOC.call(ptr, layout, old_layout, flags, nid) }
     }
 }
 
@@ -151,6 +158,7 @@ unsafe fn realloc(
         layout: Layout,
         old_layout: Layout,
         flags: Flags,
+        nid: NumaNode,
     ) -> Result<NonNull<[u8]>, AllocError> {
         // TODO: Support alignments larger than PAGE_SIZE.
         if layout.align() > bindings::PAGE_SIZE {
@@ -160,7 +168,7 @@ unsafe fn realloc(
 
         // SAFETY: If not `None`, `ptr` is guaranteed to point to valid memory, which was previously
         // allocated with this `Allocator`.
-        unsafe { ReallocFunc::VREALLOC.call(ptr, layout, old_layout, flags) }
+        unsafe { ReallocFunc::VREALLOC.call(ptr, layout, old_layout, flags, nid) }
     }
 }
 
@@ -175,6 +183,7 @@ unsafe fn realloc(
         layout: Layout,
         old_layout: Layout,
         flags: Flags,
+        nid: NumaNode,
     ) -> Result<NonNull<[u8]>, AllocError> {
         // TODO: Support alignments larger than PAGE_SIZE.
         if layout.align() > bindings::PAGE_SIZE {
@@ -184,6 +193,6 @@ unsafe fn realloc(
 
         // SAFETY: If not `None`, `ptr` is guaranteed to point to valid memory, which was previously
         // allocated with this `Allocator`.
-        unsafe { ReallocFunc::KVREALLOC.call(ptr, layout, old_layout, flags) }
+        unsafe { ReallocFunc::KVREALLOC.call(ptr, layout, old_layout, flags, nid) }
     }
 }
diff --git a/rust/kernel/alloc/kbox.rs b/rust/kernel/alloc/kbox.rs
index c386ff771d50..5c0b020fb2a4 100644
--- a/rust/kernel/alloc/kbox.rs
+++ b/rust/kernel/alloc/kbox.rs
@@ -4,7 +4,7 @@
 
 #[allow(unused_imports)] // Used in doc comments.
 use super::allocator::{KVmalloc, Kmalloc, Vmalloc};
-use super::{AllocError, Allocator, Flags};
+use super::{AllocError, Allocator, Flags, NumaNode};
 use core::alloc::Layout;
 use core::fmt;
 use core::marker::PhantomData;
@@ -271,7 +271,7 @@ pub fn new(x: T, flags: Flags) -> Result<Self, AllocError> {
     /// ```
     pub fn new_uninit(flags: Flags) -> Result<Box<MaybeUninit<T>, A>, AllocError> {
         let layout = Layout::new::<MaybeUninit<T>>();
-        let ptr = A::alloc(layout, flags)?;
+        let ptr = A::alloc(layout, flags, NumaNode::NO_NODE)?;
 
         // INVARIANT: `ptr` is either a dangling pointer or points to memory allocated with `A`,
         // which is sufficient in size and alignment for storing a `T`.
diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
index 1a0dd852a468..aa5d27176d9c 100644
--- a/rust/kernel/alloc/kvec.rs
+++ b/rust/kernel/alloc/kvec.rs
@@ -5,7 +5,7 @@
 use super::{
     allocator::{KVmalloc, Kmalloc, Vmalloc},
     layout::ArrayLayout,
-    AllocError, Allocator, Box, Flags,
+    AllocError, Allocator, Box, Flags, NumaNode,
 };
 use core::{
     fmt,
@@ -633,6 +633,7 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
                 layout.into(),
                 self.layout.into(),
                 flags,
+                NumaNode::NO_NODE,
             )?
         };
 
@@ -1058,7 +1059,13 @@ pub fn collect(self, flags: Flags) -> Vec<T, A> {
             // the type invariant to be smaller than `cap`. Depending on `realloc` this operation
             // may shrink the buffer or leave it as it is.
             ptr = match unsafe {
-                A::realloc(Some(buf.cast()), layout.into(), old_layout.into(), flags)
+                A::realloc(
+                    Some(buf.cast()),
+                    layout.into(),
+                    old_layout.into(),
+                    flags,
+                    NumaNode::NO_NODE,
+                )
             } {
                 // If we fail to shrink, which likely can't even happen, continue with the existing
                 // buffer.
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v12 4/4] rust: support large alignments in allocations
  2025-07-09 17:23 [PATCH v12 0/4] support large align and nid in Rust allocators Vitaly Wool
                   ` (2 preceding siblings ...)
  2025-07-09 17:24 ` [PATCH v12 3/4] rust: add support for NUMA ids in allocations Vitaly Wool
@ 2025-07-09 17:25 ` Vitaly Wool
  2025-07-09 21:00   ` Danilo Krummrich
  3 siblings, 1 reply; 28+ messages in thread
From: Vitaly Wool @ 2025-07-09 17:25 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato, Vitaly Wool

Add support for large (> PAGE_SIZE) alignments in Rust allocators.
All the preparations on the C side are already done, we just need
to add bindings for <alloc>_node_align() functions and start
using those.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
Acked-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/helpers/slab.c            | 10 ++++++----
 rust/helpers/vmalloc.c         |  5 +++--
 rust/kernel/alloc/allocator.rs | 29 +++++++++--------------------
 3 files changed, 18 insertions(+), 26 deletions(-)

diff --git a/rust/helpers/slab.c b/rust/helpers/slab.c
index 8472370a4338..d729be798f31 100644
--- a/rust/helpers/slab.c
+++ b/rust/helpers/slab.c
@@ -3,13 +3,15 @@
 #include <linux/slab.h>
 
 void * __must_check __realloc_size(2)
-rust_helper_krealloc_node(const void *objp, size_t new_size, gfp_t flags, int node)
+rust_helper_krealloc_node_align(const void *objp, size_t new_size, unsigned long align,
+				gfp_t flags, int node)
 {
-	return krealloc_node(objp, new_size, flags, node);
+	return krealloc_node_align(objp, new_size, align, flags, node);
 }
 
 void * __must_check __realloc_size(2)
-rust_helper_kvrealloc_node(const void *p, size_t size, gfp_t flags, int node)
+rust_helper_kvrealloc_node_align(const void *p, size_t size, unsigned long align,
+				gfp_t flags, int node)
 {
-	return kvrealloc_node(p, size, flags, node);
+	return kvrealloc_node_align(p, size, align, flags, node);
 }
diff --git a/rust/helpers/vmalloc.c b/rust/helpers/vmalloc.c
index 62d30db9a1a6..7d7f7336b3d2 100644
--- a/rust/helpers/vmalloc.c
+++ b/rust/helpers/vmalloc.c
@@ -3,7 +3,8 @@
 #include <linux/vmalloc.h>
 
 void * __must_check __realloc_size(2)
-rust_helper_vrealloc_node(const void *p, size_t size, gfp_t flags, int node)
+rust_helper_vrealloc_node_align(const void *p, size_t size, unsigned long align,
+				gfp_t flags, int node)
 {
-	return vrealloc_node(p, size, flags, node);
+	return vrealloc_node_align(p, size, align, flags, node);
 }
diff --git a/rust/kernel/alloc/allocator.rs b/rust/kernel/alloc/allocator.rs
index 8af7e04e3cc6..429d1b060f31 100644
--- a/rust/kernel/alloc/allocator.rs
+++ b/rust/kernel/alloc/allocator.rs
@@ -56,25 +56,26 @@ fn aligned_size(new_layout: Layout) -> usize {
 
 /// # Invariants
 ///
-/// One of the following: `krealloc_node`, `vrealloc_node`, `kvrealloc_node`.
+/// One of the following: `krealloc_node_align`, `vrealloc_node_align`, `kvrealloc_node_align`.
 struct ReallocFunc(
     unsafe extern "C" fn(
         *const crate::ffi::c_void,
         usize,
+        crate::ffi::c_ulong,
         u32,
         crate::ffi::c_int,
     ) -> *mut crate::ffi::c_void,
 );
 
 impl ReallocFunc {
-    // INVARIANT: `krealloc_node` satisfies the type invariants.
-    const KREALLOC: Self = Self(bindings::krealloc_node);
+    // INVARIANT: `krealloc_node_align` satisfies the type invariants.
+    const KREALLOC: Self = Self(bindings::krealloc_node_align);
 
-    // INVARIANT: `vrealloc_node` satisfies the type invariants.
-    const VREALLOC: Self = Self(bindings::vrealloc_node);
+    // INVARIANT: `vrealloc_node_align` satisfies the type invariants.
+    const VREALLOC: Self = Self(bindings::vrealloc_node_align);
 
-    // INVARIANT: `kvrealloc_node` satisfies the type invariants.
-    const KVREALLOC: Self = Self(bindings::kvrealloc_node);
+    // INVARIANT: `kvrealloc_node_align` satisfies the type invariants.
+    const KVREALLOC: Self = Self(bindings::kvrealloc_node_align);
 
     /// # Safety
     ///
@@ -116,7 +117,7 @@ unsafe fn call(
         // - Those functions provide the guarantees of this function.
         let raw_ptr = unsafe {
             // If `size == 0` and `ptr != NULL` the memory behind the pointer is freed.
-            self.0(ptr.cast(), size, flags.0, nid.0).cast()
+            self.0(ptr.cast(), size, layout.align(), flags.0, nid.0).cast()
         };
 
         let ptr = if size == 0 {
@@ -160,12 +161,6 @@ unsafe fn realloc(
         flags: Flags,
         nid: NumaNode,
     ) -> Result<NonNull<[u8]>, AllocError> {
-        // TODO: Support alignments larger than PAGE_SIZE.
-        if layout.align() > bindings::PAGE_SIZE {
-            pr_warn!("Vmalloc does not support alignments larger than PAGE_SIZE yet.\n");
-            return Err(AllocError);
-        }
-
         // SAFETY: If not `None`, `ptr` is guaranteed to point to valid memory, which was previously
         // allocated with this `Allocator`.
         unsafe { ReallocFunc::VREALLOC.call(ptr, layout, old_layout, flags, nid) }
@@ -185,12 +180,6 @@ unsafe fn realloc(
         flags: Flags,
         nid: NumaNode,
     ) -> Result<NonNull<[u8]>, AllocError> {
-        // TODO: Support alignments larger than PAGE_SIZE.
-        if layout.align() > bindings::PAGE_SIZE {
-            pr_warn!("KVmalloc does not support alignments larger than PAGE_SIZE yet.\n");
-            return Err(AllocError);
-        }
-
         // SAFETY: If not `None`, `ptr` is guaranteed to point to valid memory, which was previously
         // allocated with this `Allocator`.
         unsafe { ReallocFunc::KVREALLOC.call(ptr, layout, old_layout, flags, nid) }
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 17:24 ` [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc Vitaly Wool
@ 2025-07-09 19:01   ` Liam R. Howlett
  2025-07-10  6:21     ` Vitaly Wool
  2025-07-10 14:07     ` Vitaly Wool
  2025-07-09 22:53   ` Alexei Starovoitov
  1 sibling, 2 replies; 28+ messages in thread
From: Liam R. Howlett @ 2025-07-09 19:01 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: linux-mm, akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Kent Overstreet, linux-bcachefs, bpf, Herbert Xu, Jann Horn,
	Pedro Falcato

* Vitaly Wool <vitaly.wool@konsulko.se> [250709 13:24]:
> Reimplement vrealloc() to be able to set node and alignment should
> a user need to do so. Rename the function to vrealloc_node_align()
> to better match what it actually does now and introduce macros for
> vrealloc() and friends for backward compatibility.
> 
> With that change we also provide the ability for the Rust part of
> the kernel to set node and alignment in its allocations.
> 
> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  include/linux/vmalloc.h | 12 +++++++++---
>  mm/nommu.c              |  3 ++-
>  mm/vmalloc.c            | 31 ++++++++++++++++++++++++++-----
>  3 files changed, 37 insertions(+), 9 deletions(-)
> 
...

> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 6dbcdceecae1..03dd06097b25 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -4089,19 +4089,31 @@ void *vzalloc_node_noprof(unsigned long size, int node)
>  EXPORT_SYMBOL(vzalloc_node_noprof);
>  
>  /**
> - * vrealloc - reallocate virtually contiguous memory; contents remain unchanged
> + * vrealloc_node_align_noprof - reallocate virtually contiguous memory; contents
> + * remain unchanged
>   * @p: object to reallocate memory for
>   * @size: the size to reallocate
> + * @align: requested alignment
>   * @flags: the flags for the page level allocator
> + * @nid: node number of the target node
> + *
> + * If @p is %NULL, vrealloc_XXX() behaves exactly like vmalloc(). If @size is
> + * 0 and @p is not a %NULL pointer, the object pointed to is freed.
>   *
> - * If @p is %NULL, vrealloc() behaves exactly like vmalloc(). If @size is 0 and
> - * @p is not a %NULL pointer, the object pointed to is freed.
> + * if @nid is not NUMA_NO_NODE, this function will try to allocate memory on
> + * the given node. If reallocation is not necessary (e. g. the new size is less
> + * than the current allocated size), the current allocation will be preserved
> + * unless __GFP_THISNODE is set. In the latter case a new allocation on the
> + * requested node will be attempted.

I am having a very hard time understanding what you mean here.  What is
the latter case?

If @nis is !NUMA_NO_NODE, the allocation will be attempted on the given
node.  Then things sort of get confusing.  What is the latter case?

>   *
>   * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
>   * initial memory allocation, every subsequent call to this API for the same
>   * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
>   * __GFP_ZERO is not fully honored by this API.
>   *
> + * If the requested alignment is bigger than the one the *existing* allocation
> + * has, this function will fail.
> + *

It might be better to say something like:
Requesting an alignment that is bigger than the alignment of the
*existing* allocation will fail.

...

Thanks,
Liam


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 3/4] rust: add support for NUMA ids in allocations
  2025-07-09 17:24 ` [PATCH v12 3/4] rust: add support for NUMA ids in allocations Vitaly Wool
@ 2025-07-09 20:58   ` Danilo Krummrich
  0 siblings, 0 replies; 28+ messages in thread
From: Danilo Krummrich @ 2025-07-09 20:58 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: linux-mm, akpm, linux-kernel, Uladzislau Rezki, Alice Ryhl,
	Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Wed, Jul 09, 2025 at 07:24:58PM +0200, Vitaly Wool wrote:
> Add a new type to support specifying NUMA identifiers in Rust
> allocators and extend the allocators to have NUMA id as a
> parameter. Thus, modify ReallocFunc to use the new extended realloc
> primitives from the C side of the kernel (i. e.
> k[v]realloc_node_align/vrealloc_node_align) and add the new function
> alloc_node to the Allocator trait while keeping the existing one
> (alloc) for backward compatibility.
> 
> This will allow to specify node to use for allocation of e. g.
> {KV}Box, as well as for future NUMA aware users of the API.
> 
> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>

> +/// Non Uniform Memory Access (NUMA) node identifier

Please end with a period.

> +#[derive(Clone, Copy, PartialEq)]
> +pub struct NumaNode(i32);
> +
> +impl NumaNode {
> +    /// create a new NUMA node identifer (non-negative integer)

s/identifer/identifier/

Please also add an empty line in between those two.

> +    /// returns EINVAL if a negative id or an id exceeding MAX_NUMNODES is specified

Please start with a capital letter, use markdown and end with a period.

> +    pub fn new(node: i32) -> Result<Self> {
> +        // SAFETY: MAX_NUMNODES never exceeds 2**10 because NODES_SHIFT is 0..10

This must not be a safety comment, but a normal one. Please use markdown and end
the sentence with a period.

> +        if node < 0 || node >= bindings::MAX_NUMNODES as i32 {
> +            return Err(EINVAL);
> +        }
> +        Ok(Self(node))
> +    }
> +}

With that fixed,

	Acked-by: Danilo Krummrich <dakr@kernel.org>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 4/4] rust: support large alignments in allocations
  2025-07-09 17:25 ` [PATCH v12 4/4] rust: support large alignments " Vitaly Wool
@ 2025-07-09 21:00   ` Danilo Krummrich
  0 siblings, 0 replies; 28+ messages in thread
From: Danilo Krummrich @ 2025-07-09 21:00 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: linux-mm, akpm, linux-kernel, Uladzislau Rezki, Alice Ryhl,
	Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Wed, Jul 09, 2025 at 07:25:09PM +0200, Vitaly Wool wrote:
>  void * __must_check __realloc_size(2)
> -rust_helper_krealloc_node(const void *objp, size_t new_size, gfp_t flags, int node)
> +rust_helper_krealloc_node_align(const void *objp, size_t new_size, unsigned long align,
> +				gfp_t flags, int node)

CHECK: Alignment should match open parenthesis
#38: FILE: rust/helpers/slab.c:14:
+rust_helper_kvrealloc_node_align(const void *p, size_t size, unsigned long align,
+                               gfp_t flags, int node)

total: 0 errors, 0 warnings, 1 checks, 94 lines checked

Please make sure to always run scripts/checkpatch.pl. :)

> @@ -185,12 +180,6 @@ unsafe fn realloc(
>          flags: Flags,
>          nid: NumaNode,
>      ) -> Result<NonNull<[u8]>, AllocError> {
> -        // TODO: Support alignments larger than PAGE_SIZE.
> -        if layout.align() > bindings::PAGE_SIZE {
> -            pr_warn!("KVmalloc does not support alignments larger than PAGE_SIZE yet.\n");
> -            return Err(AllocError);
> -        }
> -

Since you remove the pr_warn!(), you also have to remove the corresponding
import, otherwise you get a clippy warning.

Please build with CLIPPY=1, see also [1].

[1] https://rust-for-linux.com/contributing#submit-checklist-addendum


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 17:24 ` [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc Vitaly Wool
  2025-07-09 19:01   ` Liam R. Howlett
@ 2025-07-09 22:53   ` Alexei Starovoitov
  2025-07-09 22:57     ` Danilo Krummrich
  1 sibling, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2025-07-09 22:53 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: linux-mm, Andrew Morton, LKML, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Wed, Jul 9, 2025 at 10:25 AM Vitaly Wool <vitaly.wool@konsulko.se> wrote:
>
>
> -void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
> +void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
> +                                gfp_t flags, int node)
>  {

imo this is a silly pattern to rename functions because they
got new arguments.
The names of the args are clear enough "align" and "node".
I see no point in adding the same suffixes to a function name.
In the future this function will receive another argument and
the function would be renamed again?!
"_noprof" suffix makes sense, since it's there for alloc_hooks,
but "_node_align_" is unnecessary.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 22:53   ` Alexei Starovoitov
@ 2025-07-09 22:57     ` Danilo Krummrich
  2025-07-09 23:14       ` Alexei Starovoitov
  0 siblings, 1 reply; 28+ messages in thread
From: Danilo Krummrich @ 2025-07-09 22:57 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Vitaly Wool, linux-mm, Andrew Morton, LKML, Uladzislau Rezki,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On 7/10/25 12:53 AM, Alexei Starovoitov wrote:
> On Wed, Jul 9, 2025 at 10:25 AM Vitaly Wool <vitaly.wool@konsulko.se> wrote:
>>
>>
>> -void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
>> +void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
>> +                                gfp_t flags, int node)
>>   {
> 
> imo this is a silly pattern to rename functions because they
> got new arguments.
> The names of the args are clear enough "align" and "node".
> I see no point in adding the same suffixes to a function name.
> In the future this function will receive another argument and
> the function would be renamed again?!
> "_noprof" suffix makes sense, since it's there for alloc_hooks,
> but "_node_align_" is unnecessary.

Do you have an alternative proposal given that we also have vrealloc() and
vrealloc_node()?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 22:57     ` Danilo Krummrich
@ 2025-07-09 23:14       ` Alexei Starovoitov
  2025-07-09 23:26         ` Danilo Krummrich
  0 siblings, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2025-07-09 23:14 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Vitaly Wool, linux-mm, Andrew Morton, LKML, Uladzislau Rezki,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Wed, Jul 9, 2025 at 3:57 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On 7/10/25 12:53 AM, Alexei Starovoitov wrote:
> > On Wed, Jul 9, 2025 at 10:25 AM Vitaly Wool <vitaly.wool@konsulko.se> wrote:
> >>
> >>
> >> -void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
> >> +void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
> >> +                                gfp_t flags, int node)
> >>   {
> >
> > imo this is a silly pattern to rename functions because they
> > got new arguments.
> > The names of the args are clear enough "align" and "node".
> > I see no point in adding the same suffixes to a function name.
> > In the future this function will receive another argument and
> > the function would be renamed again?!
> > "_noprof" suffix makes sense, since it's there for alloc_hooks,
> > but "_node_align_" is unnecessary.
>
> Do you have an alternative proposal given that we also have vrealloc() and
> vrealloc_node()?

vrealloc_node()?! There is no such thing in the tree.
There are various k[zm]alloc_node() which are artifacts of the past
when NUMA just appeared and people cared about CONFIG_NUMA vs not.
Nowadays NUMA is everywhere and any new code must support NUMA
from the start. Hence no point in carrying old baggage and obsolete names.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 23:14       ` Alexei Starovoitov
@ 2025-07-09 23:26         ` Danilo Krummrich
  2025-07-10  0:39           ` Alexei Starovoitov
  0 siblings, 1 reply; 28+ messages in thread
From: Danilo Krummrich @ 2025-07-09 23:26 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Vitaly Wool, linux-mm, Andrew Morton, LKML, Uladzislau Rezki,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Thu Jul 10, 2025 at 1:14 AM CEST, Alexei Starovoitov wrote:
> On Wed, Jul 9, 2025 at 3:57 PM Danilo Krummrich <dakr@kernel.org> wrote:
>>
>> On 7/10/25 12:53 AM, Alexei Starovoitov wrote:
>> > On Wed, Jul 9, 2025 at 10:25 AM Vitaly Wool <vitaly.wool@konsulko.se> wrote:
>> >>
>> >>
>> >> -void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
>> >> +void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
>> >> +                                gfp_t flags, int node)
>> >>   {
>> >
>> > imo this is a silly pattern to rename functions because they
>> > got new arguments.
>> > The names of the args are clear enough "align" and "node".
>> > I see no point in adding the same suffixes to a function name.
>> > In the future this function will receive another argument and
>> > the function would be renamed again?!
>> > "_noprof" suffix makes sense, since it's there for alloc_hooks,
>> > but "_node_align_" is unnecessary.
>>
>> Do you have an alternative proposal given that we also have vrealloc() and
>> vrealloc_node()?
>
> vrealloc_node()?! There is no such thing in the tree.
> There are various k[zm]alloc_node() which are artifacts of the past
> when NUMA just appeared and people cared about CONFIG_NUMA vs not.
> Nowadays NUMA is everywhere and any new code must support NUMA
> from the start. Hence no point in carrying old baggage and obsolete names.

This patch adds it; do you suggest to redefine vrealloc_noprof() to take align
and nid? If we don't mind being inconsistent with krealloc_noprof() and
kvrealloc_noprof() that's fine I guess.

FWIW, I prefer consistency.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 23:26         ` Danilo Krummrich
@ 2025-07-10  0:39           ` Alexei Starovoitov
  2025-07-10  6:16             ` Vitaly Wool
  2025-07-10  9:57             ` Danilo Krummrich
  0 siblings, 2 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2025-07-10  0:39 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Vitaly Wool, linux-mm, Andrew Morton, LKML, Uladzislau Rezki,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Wed, Jul 9, 2025 at 4:26 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Thu Jul 10, 2025 at 1:14 AM CEST, Alexei Starovoitov wrote:
> > On Wed, Jul 9, 2025 at 3:57 PM Danilo Krummrich <dakr@kernel.org> wrote:
> >>
> >> On 7/10/25 12:53 AM, Alexei Starovoitov wrote:
> >> > On Wed, Jul 9, 2025 at 10:25 AM Vitaly Wool <vitaly.wool@konsulko.se> wrote:
> >> >>
> >> >>
> >> >> -void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
> >> >> +void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
> >> >> +                                gfp_t flags, int node)
> >> >>   {
> >> >
> >> > imo this is a silly pattern to rename functions because they
> >> > got new arguments.
> >> > The names of the args are clear enough "align" and "node".
> >> > I see no point in adding the same suffixes to a function name.
> >> > In the future this function will receive another argument and
> >> > the function would be renamed again?!
> >> > "_noprof" suffix makes sense, since it's there for alloc_hooks,
> >> > but "_node_align_" is unnecessary.
> >>
> >> Do you have an alternative proposal given that we also have vrealloc() and
> >> vrealloc_node()?
> >
> > vrealloc_node()?! There is no such thing in the tree.
> > There are various k[zm]alloc_node() which are artifacts of the past
> > when NUMA just appeared and people cared about CONFIG_NUMA vs not.
> > Nowadays NUMA is everywhere and any new code must support NUMA
> > from the start. Hence no point in carrying old baggage and obsolete names.
>
> This patch adds it; do you suggest to redefine vrealloc_noprof() to take align
> and nid? If we don't mind being inconsistent with krealloc_noprof() and
> kvrealloc_noprof() that's fine I guess.
>
> FWIW, I prefer consistency.

What inconsistency are you talking about? That
krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
and
vrealloc_noprof(const void *p, size_t size, unsigned long align,
                gfp_t flags, int node)
have different number of arguments?!

See:
alloc_pages_noprof(gfp_t gfp, unsigned int order);
__alloc_pages_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
                nodemask_t *nodemask);

Adding double underscore to keep all existing callers of
vrealloc_noprof() without changes and do:

vrealloc_noprof(const void *p, size_t size, gfp_t flags);
__vrealloc_noprof(const void *p, size_t size, unsigned long align,
gfp_t flags, int node);

is fine and consistent with how things were done in the past,
but adding "_node_align_" to the function name and code churn to all
callsites is a cargo cult.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-10  0:39           ` Alexei Starovoitov
@ 2025-07-10  6:16             ` Vitaly Wool
  2025-07-10  9:57             ` Danilo Krummrich
  1 sibling, 0 replies; 28+ messages in thread
From: Vitaly Wool @ 2025-07-10  6:16 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Danilo Krummrich, linux-mm, Andrew Morton, LKML, Uladzislau Rezki,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato



> On Jul 10, 2025, at 2:39 AM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> On Wed, Jul 9, 2025 at 4:26 PM Danilo Krummrich <dakr@kernel.org> wrote:
>> 
>> On Thu Jul 10, 2025 at 1:14 AM CEST, Alexei Starovoitov wrote:
>>> On Wed, Jul 9, 2025 at 3:57 PM Danilo Krummrich <dakr@kernel.org> wrote:
>>>> 
>>>> On 7/10/25 12:53 AM, Alexei Starovoitov wrote:
>>>>> On Wed, Jul 9, 2025 at 10:25 AM Vitaly Wool <vitaly.wool@konsulko.se> wrote:
>>>>>> 
>>>>>> 
>>>>>> -void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
>>>>>> +void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
>>>>>> +                                gfp_t flags, int node)
>>>>>>  {
>>>>> 
>>>>> imo this is a silly pattern to rename functions because they
>>>>> got new arguments.
>>>>> The names of the args are clear enough "align" and "node".
>>>>> I see no point in adding the same suffixes to a function name.
>>>>> In the future this function will receive another argument and
>>>>> the function would be renamed again?!
>>>>> "_noprof" suffix makes sense, since it's there for alloc_hooks,
>>>>> but "_node_align_" is unnecessary.
>>>> 
>>>> Do you have an alternative proposal given that we also have vrealloc() and
>>>> vrealloc_node()?
>>> 
>>> vrealloc_node()?! There is no such thing in the tree.
>>> There are various k[zm]alloc_node() which are artifacts of the past
>>> when NUMA just appeared and people cared about CONFIG_NUMA vs not.
>>> Nowadays NUMA is everywhere and any new code must support NUMA
>>> from the start. Hence no point in carrying old baggage and obsolete names.
>> 
>> This patch adds it; do you suggest to redefine vrealloc_noprof() to take align
>> and nid? If we don't mind being inconsistent with krealloc_noprof() and
>> kvrealloc_noprof() that's fine I guess.
>> 
>> FWIW, I prefer consistency.
> 
> What inconsistency are you talking about? That
> krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
> and
> vrealloc_noprof(const void *p, size_t size, unsigned long align,
>                gfp_t flags, int node)
> have different number of arguments?!
> 
> See:
> alloc_pages_noprof(gfp_t gfp, unsigned int order);
> __alloc_pages_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
>                nodemask_t *nodemask);
> 
> Adding double underscore to keep all existing callers of
> vrealloc_noprof() without changes and do:
> 
> vrealloc_noprof(const void *p, size_t size, gfp_t flags);
> __vrealloc_noprof(const void *p, size_t size, unsigned long align,
> gfp_t flags, int node);
> 
> is fine and consistent with how things were done in the past,
> but adding "_node_align_" to the function name and code churn to all
> callsites is a cargo cult.
> 

I see your point but your approach is currently only applicable to vmalloc and it will not work for slub because the latter already has __kmalloc_node, __kvmalloc_node etc. and we want to keep at least some naming consistency between k[v]* and v* functions.

This whole patchset is only intended to add the capability to set node and properly handle alignment requests in Rust allocators, and is thus well aligned with your idea that the new code must support NUMA (which I do share). I would suggest that we get this in as it is, and then I can take the burden of straightening out the naming which will inevitably lead to many modifications in other parts of the kernel. The latter I am fine with, too, but not in this series.

~Vitaly

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 19:01   ` Liam R. Howlett
@ 2025-07-10  6:21     ` Vitaly Wool
  2025-07-10 15:09       ` Liam R. Howlett
  2025-07-10 15:19       ` Lorenzo Stoakes
  2025-07-10 14:07     ` Vitaly Wool
  1 sibling, 2 replies; 28+ messages in thread
From: Vitaly Wool @ 2025-07-10  6:21 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Kent Overstreet, linux-bcachefs, bpf, Herbert Xu, Jann Horn,
	Pedro Falcato

[-- Attachment #1: Type: text/plain, Size: 3473 bytes --]



> On Jul 9, 2025, at 9:01 PM, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> 
> * Vitaly Wool <vitaly.wool@konsulko.se <mailto:vitaly.wool@konsulko.se>> [250709 13:24]:
>> Reimplement vrealloc() to be able to set node and alignment should
>> a user need to do so. Rename the function to vrealloc_node_align()
>> to better match what it actually does now and introduce macros for
>> vrealloc() and friends for backward compatibility.
>> 
>> With that change we also provide the ability for the Rust part of
>> the kernel to set node and alignment in its allocations.
>> 
>> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
>> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>> include/linux/vmalloc.h | 12 +++++++++---
>> mm/nommu.c              |  3 ++-
>> mm/vmalloc.c            | 31 ++++++++++++++++++++++++++-----
>> 3 files changed, 37 insertions(+), 9 deletions(-)
>> 
> ...
> 
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index 6dbcdceecae1..03dd06097b25 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -4089,19 +4089,31 @@ void *vzalloc_node_noprof(unsigned long size, int node)
>> EXPORT_SYMBOL(vzalloc_node_noprof);
>> 
>> /**
>> - * vrealloc - reallocate virtually contiguous memory; contents remain unchanged
>> + * vrealloc_node_align_noprof - reallocate virtually contiguous memory; contents
>> + * remain unchanged
>>  * @p: object to reallocate memory for
>>  * @size: the size to reallocate
>> + * @align: requested alignment
>>  * @flags: the flags for the page level allocator
>> + * @nid: node number of the target node
>> + *
>> + * If @p is %NULL, vrealloc_XXX() behaves exactly like vmalloc(). If @size is
>> + * 0 and @p is not a %NULL pointer, the object pointed to is freed.
>>  *
>> - * If @p is %NULL, vrealloc() behaves exactly like vmalloc(). If @size is 0 and
>> - * @p is not a %NULL pointer, the object pointed to is freed.
>> + * if @nid is not NUMA_NO_NODE, this function will try to allocate memory on
>> + * the given node. If reallocation is not necessary (e. g. the new size is less
>> + * than the current allocated size), the current allocation will be preserved
>> + * unless __GFP_THISNODE is set. In the latter case a new allocation on the
>> + * requested node will be attempted.
> 
> I am having a very hard time understanding what you mean here.  What is
> the latter case?
> 
> If @nis is !NUMA_NO_NODE, the allocation will be attempted on the given
> node.  Then things sort of get confusing.  What is the latter case?

The latter case is __GFP_THISNODE present in flags. That’s the latest if-clause in this paragraph.
> 
>>  *
>>  * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
>>  * initial memory allocation, every subsequent call to this API for the same
>>  * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
>>  * __GFP_ZERO is not fully honored by this API.
>>  *
>> + * If the requested alignment is bigger than the one the *existing* allocation
>> + * has, this function will fail.
>> + *
> 
> It might be better to say something like:
> Requesting an alignment that is bigger than the alignment of the
> *existing* allocation will fail.
> 

The whole function description in fact consists of several if-clauses (some of which are nested) so I am just following the pattern here.

~Vitaly

[-- Attachment #2: Type: text/html, Size: 13996 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-09 17:24 ` [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc Vitaly Wool
@ 2025-07-10  7:45   ` Vlastimil Babka
  2025-07-11  8:58   ` Harry Yoo
  1 sibling, 0 replies; 28+ messages in thread
From: Vlastimil Babka @ 2025-07-10  7:45 UTC (permalink / raw)
  To: Vitaly Wool, linux-mm
  Cc: akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, rust-for-linux, Lorenzo Stoakes, Liam R . Howlett,
	Kent Overstreet, linux-bcachefs, bpf, Herbert Xu, Jann Horn,
	Pedro Falcato

On 7/9/25 19:24, Vitaly Wool wrote:
> Reimplement k[v]realloc_node() to be able to set node and
> alignment should a user need to do so. In order to do that while
> retaining the maximal backward compatibility, add
> k[v]realloc_node_align() functions and redefine the rest of API
> using these new ones.
> 
> While doing that, we also keep the number of  _noprof variants to a
> minimum, which implies some changes to the existing users of older
> _noprof functions, that basically being bcachefs.
> 
> With that change we also provide the ability for the Rust part of
> the kernel to set node and alignment in its K[v]xxx
> [re]allocations.
> 
> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-10  0:39           ` Alexei Starovoitov
  2025-07-10  6:16             ` Vitaly Wool
@ 2025-07-10  9:57             ` Danilo Krummrich
  1 sibling, 0 replies; 28+ messages in thread
From: Danilo Krummrich @ 2025-07-10  9:57 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Vitaly Wool, linux-mm, Andrew Morton, LKML, Uladzislau Rezki,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Thu Jul 10, 2025 at 2:39 AM CEST, Alexei Starovoitov wrote:
> On Wed, Jul 9, 2025 at 4:26 PM Danilo Krummrich <dakr@kernel.org> wrote:
>>
>> On Thu Jul 10, 2025 at 1:14 AM CEST, Alexei Starovoitov wrote:
>> > On Wed, Jul 9, 2025 at 3:57 PM Danilo Krummrich <dakr@kernel.org> wrote:
>> >>
>> >> On 7/10/25 12:53 AM, Alexei Starovoitov wrote:
>> >> > On Wed, Jul 9, 2025 at 10:25 AM Vitaly Wool <vitaly.wool@konsulko.se> wrote:
>> >> >>
>> >> >>
>> >> >> -void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
>> >> >> +void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align,
>> >> >> +                                gfp_t flags, int node)
>> >> >>   {
>> >> >
>> >> > imo this is a silly pattern to rename functions because they
>> >> > got new arguments.
>> >> > The names of the args are clear enough "align" and "node".
>> >> > I see no point in adding the same suffixes to a function name.
>> >> > In the future this function will receive another argument and
>> >> > the function would be renamed again?!
>> >> > "_noprof" suffix makes sense, since it's there for alloc_hooks,
>> >> > but "_node_align_" is unnecessary.
>> >>
>> >> Do you have an alternative proposal given that we also have vrealloc() and
>> >> vrealloc_node()?
>> >
>> > vrealloc_node()?! There is no such thing in the tree.
>> > There are various k[zm]alloc_node() which are artifacts of the past
>> > when NUMA just appeared and people cared about CONFIG_NUMA vs not.
>> > Nowadays NUMA is everywhere and any new code must support NUMA
>> > from the start. Hence no point in carrying old baggage and obsolete names.
>>
>> This patch adds it; do you suggest to redefine vrealloc_noprof() to take align
>> and nid? If we don't mind being inconsistent with krealloc_noprof() and
>> kvrealloc_noprof() that's fine I guess.
>>
>> FWIW, I prefer consistency.
>
> What inconsistency are you talking about? That
> krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
> and
> vrealloc_noprof(const void *p, size_t size, unsigned long align,
>                 gfp_t flags, int node)
> have different number of arguments?!
>
> See:
> alloc_pages_noprof(gfp_t gfp, unsigned int order);
> __alloc_pages_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
>                 nodemask_t *nodemask);
>
> Adding double underscore to keep all existing callers of
> vrealloc_noprof() without changes and do:
>
> vrealloc_noprof(const void *p, size_t size, gfp_t flags);
> __vrealloc_noprof(const void *p, size_t size, unsigned long align,
> gfp_t flags, int node);
>
> is fine and consistent with how things were done in the past,
> but adding "_node_align_" to the function name and code churn to all
> callsites is a cargo cult.

As Vitaly mentioned in a different reply, this would be inconsistent with the
'k' and 'kv' variants, which have the suffix '_node'.

Anyways, in general I don't think that adding underscores for functions that
basically do the same thing but are getting more specialized is a great pattern
for things that are not strictly limited to a narrow context.

Please note, I'm not saying we should encode additional arguments in the name
either. I think it really depends on the actual case.

In this case, it seems to make sense to me that there is e.g. kmalloc() and
kmalloc_node().

For a caller that's much more useful, i.e. I want classic kmalloc(), but want to
set the node, hence kmalloc_node(). Calling it __kmalloc() instead seems a bit
random.

Or do you only refer to the *_noprof() variants, which are not exported to
users? But even then, underscores still don't seem very expressive.

I'm not maintaining this code though, so just take it FWIW. :)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-09 19:01   ` Liam R. Howlett
  2025-07-10  6:21     ` Vitaly Wool
@ 2025-07-10 14:07     ` Vitaly Wool
  1 sibling, 0 replies; 28+ messages in thread
From: Vitaly Wool @ 2025-07-10 14:07 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Kent Overstreet, linux-bcachefs, bpf, Herbert Xu, Jann Horn,
	Pedro Falcato



> On Jul 9, 2025, at 9:01 PM, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> 
> * Vitaly Wool <vitaly.wool@konsulko.se> [250709 13:24]:
>> Reimplement vrealloc() to be able to set node and alignment should
>> a user need to do so. Rename the function to vrealloc_node_align()
>> to better match what it actually does now and introduce macros for
>> vrealloc() and friends for backward compatibility.
>> 
>> With that change we also provide the ability for the Rust part of
>> the kernel to set node and alignment in its allocations.
>> 
>> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
>> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>> include/linux/vmalloc.h | 12 +++++++++---
>> mm/nommu.c              |  3 ++-
>> mm/vmalloc.c            | 31 ++++++++++++++++++++++++++-----
>> 3 files changed, 37 insertions(+), 9 deletions(-)
>> 
> ...
> 
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index 6dbcdceecae1..03dd06097b25 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -4089,19 +4089,31 @@ void *vzalloc_node_noprof(unsigned long size, int node)
>> EXPORT_SYMBOL(vzalloc_node_noprof);
>> 
>> /**
>> - * vrealloc - reallocate virtually contiguous memory; contents remain unchanged
>> + * vrealloc_node_align_noprof - reallocate virtually contiguous memory; contents
>> + * remain unchanged
>>  * @p: object to reallocate memory for
>>  * @size: the size to reallocate
>> + * @align: requested alignment
>>  * @flags: the flags for the page level allocator
>> + * @nid: node number of the target node
>> + *
>> + * If @p is %NULL, vrealloc_XXX() behaves exactly like vmalloc(). If @size is
>> + * 0 and @p is not a %NULL pointer, the object pointed to is freed.
>>  *
>> - * If @p is %NULL, vrealloc() behaves exactly like vmalloc(). If @size is 0 and
>> - * @p is not a %NULL pointer, the object pointed to is freed.
>> + * if @nid is not NUMA_NO_NODE, this function will try to allocate memory on
>> + * the given node. If reallocation is not necessary (e. g. the new size is less
>> + * than the current allocated size), the current allocation will be preserved
>> + * unless __GFP_THISNODE is set. In the latter case a new allocation on the
>> + * requested node will be attempted.
> 
> I am having a very hard time understanding what you mean here.  What is
> the latter case?
> 
> If @nis is !NUMA_NO_NODE, the allocation will be attempted on the given
> node.  Then things sort of get confusing.  What is the latter case?

The latter case is __GFP_THISNODE present in flags. That’s the latest if-clause in this paragraph.

> 
>>  *
>>  * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
>>  * initial memory allocation, every subsequent call to this API for the same
>>  * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
>>  * __GFP_ZERO is not fully honored by this API.
>>  *
>> + * If the requested alignment is bigger than the one the *existing* allocation
>> + * has, this function will fail.
>> + *
> 
> It might be better to say something like:
> Requesting an alignment that is bigger than the alignment of the
> *existing* allocation will fail.
> 
The whole function description in fact consists of several if-clauses (some of which are nested) so I am just following the pattern here.

~Vitaly




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-10  6:21     ` Vitaly Wool
@ 2025-07-10 15:09       ` Liam R. Howlett
  2025-07-10 15:19       ` Lorenzo Stoakes
  1 sibling, 0 replies; 28+ messages in thread
From: Liam R. Howlett @ 2025-07-10 15:09 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: linux-mm, akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Kent Overstreet, linux-bcachefs, bpf, Herbert Xu, Jann Horn,
	Pedro Falcato

* Vitaly Wool <vitaly.wool@konsulko.se> [250710 02:21]:
> 
> 
> > On Jul 9, 2025, at 9:01 PM, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > 
> > * Vitaly Wool <vitaly.wool@konsulko.se <mailto:vitaly.wool@konsulko.se>> [250709 13:24]:
> >> Reimplement vrealloc() to be able to set node and alignment should
> >> a user need to do so. Rename the function to vrealloc_node_align()
> >> to better match what it actually does now and introduce macros for
> >> vrealloc() and friends for backward compatibility.
> >> 
> >> With that change we also provide the ability for the Rust part of
> >> the kernel to set node and alignment in its allocations.
> >> 
> >> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
> >> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> >> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >> ---
> >> include/linux/vmalloc.h | 12 +++++++++---
> >> mm/nommu.c              |  3 ++-
> >> mm/vmalloc.c            | 31 ++++++++++++++++++++++++++-----
> >> 3 files changed, 37 insertions(+), 9 deletions(-)
> >> 
> > ...
> > 
> >> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> >> index 6dbcdceecae1..03dd06097b25 100644
> >> --- a/mm/vmalloc.c
> >> +++ b/mm/vmalloc.c
> >> @@ -4089,19 +4089,31 @@ void *vzalloc_node_noprof(unsigned long size, int node)
> >> EXPORT_SYMBOL(vzalloc_node_noprof);
> >> 
> >> /**
> >> - * vrealloc - reallocate virtually contiguous memory; contents remain unchanged
> >> + * vrealloc_node_align_noprof - reallocate virtually contiguous memory; contents
> >> + * remain unchanged
> >>  * @p: object to reallocate memory for
> >>  * @size: the size to reallocate
> >> + * @align: requested alignment
> >>  * @flags: the flags for the page level allocator
> >> + * @nid: node number of the target node
> >> + *
> >> + * If @p is %NULL, vrealloc_XXX() behaves exactly like vmalloc(). If @size is
> >> + * 0 and @p is not a %NULL pointer, the object pointed to is freed.
> >>  *
> >> - * If @p is %NULL, vrealloc() behaves exactly like vmalloc(). If @size is 0 and
> >> - * @p is not a %NULL pointer, the object pointed to is freed.
> >> + * if @nid is not NUMA_NO_NODE, this function will try to allocate memory on
> >> + * the given node. If reallocation is not necessary (e. g. the new size is less
> >> + * than the current allocated size), the current allocation will be preserved
> >> + * unless __GFP_THISNODE is set. In the latter case a new allocation on the
> >> + * requested node will be attempted.
> > 
> > I am having a very hard time understanding what you mean here.  What is
> > the latter case?
> > 
> > If @nis is !NUMA_NO_NODE, the allocation will be attempted on the given
> > node.  Then things sort of get confusing.  What is the latter case?
> 
> The latter case is __GFP_THISNODE present in flags. That’s the latest if-clause in this paragraph.
> > 
> >>  *
> >>  * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
> >>  * initial memory allocation, every subsequent call to this API for the same
> >>  * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
> >>  * __GFP_ZERO is not fully honored by this API.
> >>  *
> >> + * If the requested alignment is bigger than the one the *existing* allocation
> >> + * has, this function will fail.
> >> + *
> > 
> > It might be better to say something like:
> > Requesting an alignment that is bigger than the alignment of the
> > *existing* allocation will fail.
> > 
> 
> The whole function description in fact consists of several if-clauses (some of which are nested) so I am just following the pattern here.
> 

I'm not trying to be difficult but I am trying to nicely say that this
is a mess of words.  It's easier to just go find the code and try and
figure out what is going on instead of decoding what is written in the
description.

The pattern garbage.  I was trying to tell you this nicely, but your
change is better left undocumented than to add the above description.

The wording here is difficult to follow for a native English speaker.

You don't have to keep stating "this function", that wasn't part of the
pattern.

There are too many unnecessary negative statements "if @nis is not
NUMA_NO_NODE" or "if reallocation is not necessary".

The commas indicate that you can rewrite the sentences to be more clear
by stating the facts up front.

Stating "the latter case" is a barrier to non-native English speakers
and is unclear when you have multiple statements crammed into the
preceding sentence.

And it's easy to fix, but apparently you don't want to change it and
would rather follow a broken pattern?

Regards,
Liam




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-10  6:21     ` Vitaly Wool
  2025-07-10 15:09       ` Liam R. Howlett
@ 2025-07-10 15:19       ` Lorenzo Stoakes
  2025-07-10 18:57         ` Vitaly Wool
  1 sibling, 1 reply; 28+ messages in thread
From: Lorenzo Stoakes @ 2025-07-10 15:19 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Liam R. Howlett, linux-mm, akpm, linux-kernel, Uladzislau Rezki,
	Danilo Krummrich, Alice Ryhl, Vlastimil Babka, rust-for-linux,
	Kent Overstreet, linux-bcachefs, bpf, Herbert Xu, Jann Horn,
	Pedro Falcato

On Thu, Jul 10, 2025 at 08:21:19AM +0200, Vitaly Wool wrote:
>
>
> > On Jul 9, 2025, at 9:01 PM, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Vitaly Wool <vitaly.wool@konsulko.se <mailto:vitaly.wool@konsulko.se>> [250709 13:24]:
> >> Reimplement vrealloc() to be able to set node and alignment should
> >> a user need to do so. Rename the function to vrealloc_node_align()
> >> to better match what it actually does now and introduce macros for
> >> vrealloc() and friends for backward compatibility.
> >>
> >> With that change we also provide the ability for the Rust part of
> >> the kernel to set node and alignment in its allocations.
> >>
> >> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
> >> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> >> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >> ---
> >> include/linux/vmalloc.h | 12 +++++++++---
> >> mm/nommu.c              |  3 ++-
> >> mm/vmalloc.c            | 31 ++++++++++++++++++++++++++-----
> >> 3 files changed, 37 insertions(+), 9 deletions(-)
> >>
> > ...
> >
> >> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> >> index 6dbcdceecae1..03dd06097b25 100644
> >> --- a/mm/vmalloc.c
> >> +++ b/mm/vmalloc.c
> >> @@ -4089,19 +4089,31 @@ void *vzalloc_node_noprof(unsigned long size, int node)
> >> EXPORT_SYMBOL(vzalloc_node_noprof);
> >>
> >> /**
> >> - * vrealloc - reallocate virtually contiguous memory; contents remain unchanged
> >> + * vrealloc_node_align_noprof - reallocate virtually contiguous memory; contents
> >> + * remain unchanged
> >>  * @p: object to reallocate memory for
> >>  * @size: the size to reallocate
> >> + * @align: requested alignment
> >>  * @flags: the flags for the page level allocator
> >> + * @nid: node number of the target node
> >> + *
> >> + * If @p is %NULL, vrealloc_XXX() behaves exactly like vmalloc(). If @size is
> >> + * 0 and @p is not a %NULL pointer, the object pointed to is freed.
> >>  *
> >> - * If @p is %NULL, vrealloc() behaves exactly like vmalloc(). If @size is 0 and
> >> - * @p is not a %NULL pointer, the object pointed to is freed.
> >> + * if @nid is not NUMA_NO_NODE, this function will try to allocate memory on
> >> + * the given node. If reallocation is not necessary (e. g. the new size is less
> >> + * than the current allocated size), the current allocation will be preserved
> >> + * unless __GFP_THISNODE is set. In the latter case a new allocation on the
> >> + * requested node will be attempted.

Agreed with Liam, this is completely unreadable.

I think the numa node stuff is unnecesasry, that's pretty much inferred.

I'd just go with something like 'if the function can void having to reallocate
then it does'.

Nice and simple :)

> >
> > I am having a very hard time understanding what you mean here.  What is
> > the latter case?
> >
> > If @nis is !NUMA_NO_NODE, the allocation will be attempted on the given
> > node.  Then things sort of get confusing.  What is the latter case?
>
> The latter case is __GFP_THISNODE present in flags. That’s the latest if-clause in this paragraph.
> >
> >>  *
> >>  * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
> >>  * initial memory allocation, every subsequent call to this API for the same
> >>  * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
> >>  * __GFP_ZERO is not fully honored by this API.
> >>  *
> >> + * If the requested alignment is bigger than the one the *existing* allocation
> >> + * has, this function will fail.
> >> + *
> >
> > It might be better to say something like:
> > Requesting an alignment that is bigger than the alignment of the
> > *existing* allocation will fail.
> >
>
> The whole function description in fact consists of several if-clauses (some of which are nested) so I am just following the pattern here.

Right, but in no sane world is essentially describing a series of if-clauses in
a kerneldoc a thing.

Just it keep it simple, this is meant to be an overview, people can go read the
code if they need details :)

>
> ~Vitaly

Cheers, Lorenzo


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc
  2025-07-10 15:19       ` Lorenzo Stoakes
@ 2025-07-10 18:57         ` Vitaly Wool
  0 siblings, 0 replies; 28+ messages in thread
From: Vitaly Wool @ 2025-07-10 18:57 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Liam R. Howlett, linux-mm, akpm, linux-kernel, Uladzislau Rezki,
	Danilo Krummrich, Alice Ryhl, Vlastimil Babka, rust-for-linux,
	Kent Overstreet, linux-bcachefs, bpf, Herbert Xu, Jann Horn,
	Pedro Falcato



> On Jul 10, 2025, at 5:19 PM, Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> 
> On Thu, Jul 10, 2025 at 08:21:19AM +0200, Vitaly Wool wrote:
>> 
>> 
>>> On Jul 9, 2025, at 9:01 PM, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>>> 
>>> * Vitaly Wool <vitaly.wool@konsulko.se <mailto:vitaly.wool@konsulko.se>> [250709 13:24]:
>>>> Reimplement vrealloc() to be able to set node and alignment should
>>>> a user need to do so. Rename the function to vrealloc_node_align()
>>>> to better match what it actually does now and introduce macros for
>>>> vrealloc() and friends for backward compatibility.
>>>> 
>>>> With that change we also provide the ability for the Rust part of
>>>> the kernel to set node and alignment in its allocations.
>>>> 
>>>> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
>>>> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>> ---
>>>> include/linux/vmalloc.h | 12 +++++++++---
>>>> mm/nommu.c              |  3 ++-
>>>> mm/vmalloc.c            | 31 ++++++++++++++++++++++++++-----
>>>> 3 files changed, 37 insertions(+), 9 deletions(-)
>>>> 
>>> ...
>>> 
>>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>>> index 6dbcdceecae1..03dd06097b25 100644
>>>> --- a/mm/vmalloc.c
>>>> +++ b/mm/vmalloc.c
>>>> @@ -4089,19 +4089,31 @@ void *vzalloc_node_noprof(unsigned long size, int node)
>>>> EXPORT_SYMBOL(vzalloc_node_noprof);
>>>> 
>>>> /**
>>>> - * vrealloc - reallocate virtually contiguous memory; contents remain unchanged
>>>> + * vrealloc_node_align_noprof - reallocate virtually contiguous memory; contents
>>>> + * remain unchanged
>>>> * @p: object to reallocate memory for
>>>> * @size: the size to reallocate
>>>> + * @align: requested alignment
>>>> * @flags: the flags for the page level allocator
>>>> + * @nid: node number of the target node
>>>> + *
>>>> + * If @p is %NULL, vrealloc_XXX() behaves exactly like vmalloc(). If @size is
>>>> + * 0 and @p is not a %NULL pointer, the object pointed to is freed.
>>>> *
>>>> - * If @p is %NULL, vrealloc() behaves exactly like vmalloc(). If @size is 0 and
>>>> - * @p is not a %NULL pointer, the object pointed to is freed.
>>>> + * if @nid is not NUMA_NO_NODE, this function will try to allocate memory on
>>>> + * the given node. If reallocation is not necessary (e. g. the new size is less
>>>> + * than the current allocated size), the current allocation will be preserved
>>>> + * unless __GFP_THISNODE is set. In the latter case a new allocation on the
>>>> + * requested node will be attempted.
> 
> Agreed with Liam, this is completely unreadable.
> 
> I think the numa node stuff is unnecesasry, that's pretty much inferred.
> 
> I'd just go with something like 'if the function can void having to reallocate
> then it does'.
> 
> Nice and simple :)

I think it is important to stress that the function is not always following the specified nid.
How about “If the caller wants the new memory to be on specific node *only*, __GFP_THISNODE flag should be set, otherwise the function will try to avoid reallocation and possibly disregard the specified @nid” ?

> 
>>> 
>>> I am having a very hard time understanding what you mean here.  What is
>>> the latter case?
>>> 
>>> If @nis is !NUMA_NO_NODE, the allocation will be attempted on the given
>>> node.  Then things sort of get confusing.  What is the latter case?
>> 
>> The latter case is __GFP_THISNODE present in flags. That’s the latest if-clause in this paragraph.
>>> 
>>>> *
>>>> * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
>>>> * initial memory allocation, every subsequent call to this API for the same
>>>> * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
>>>> * __GFP_ZERO is not fully honored by this API.
>>>> *
>>>> + * If the requested alignment is bigger than the one the *existing* allocation
>>>> + * has, this function will fail.
>>>> + *
>>> 
>>> It might be better to say something like:
>>> Requesting an alignment that is bigger than the alignment of the
>>> *existing* allocation will fail.
>>> 
>> 
>> The whole function description in fact consists of several if-clauses (some of which are nested) so I am just following the pattern here.
> 
> Right, but in no sane world is essentially describing a series of if-clauses in
> a kerneldoc a thing.
> 
> Just it keep it simple, this is meant to be an overview, people can go read the
> code if they need details :)
> 
Alright, no strong feelings about it anyway. Will reword as you guys suggest.

Thanks,
Vitaly



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-09 17:24 ` [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc Vitaly Wool
  2025-07-10  7:45   ` Vlastimil Babka
@ 2025-07-11  8:58   ` Harry Yoo
  2025-07-11 11:11     ` Kent Overstreet
  2025-07-11 15:43     ` Vlastimil Babka
  1 sibling, 2 replies; 28+ messages in thread
From: Harry Yoo @ 2025-07-11  8:58 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: linux-mm, akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, Vlastimil Babka, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Wed, Jul 09, 2025 at 07:24:41PM +0200, Vitaly Wool wrote:
> Reimplement k[v]realloc_node() to be able to set node and
> alignment should a user need to do so. In order to do that while
> retaining the maximal backward compatibility, add
> k[v]realloc_node_align() functions and redefine the rest of API
> using these new ones.
> 
> While doing that, we also keep the number of  _noprof variants to a
> minimum, which implies some changes to the existing users of older
> _noprof functions, that basically being bcachefs.
> 
> With that change we also provide the ability for the Rust part of
> the kernel to set node and alignment in its K[v]xxx
> [re]allocations.
> 
> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
> ---
>  fs/bcachefs/darray.c   |  2 +-
>  fs/bcachefs/util.h     |  2 +-
>  include/linux/bpfptr.h |  2 +-
>  include/linux/slab.h   | 38 +++++++++++++++----------
>  lib/rhashtable.c       |  4 +--
>  mm/slub.c              | 64 +++++++++++++++++++++++++++++-------------
>  6 files changed, 72 insertions(+), 40 deletions(-)
 
> diff --git a/mm/slub.c b/mm/slub.c
> index c4b64821e680..6fad4cdea6c4 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4845,7 +4845,7 @@ void kfree(const void *object)
>  EXPORT_SYMBOL(kfree);
>  
>  static __always_inline __realloc_size(2) void *
> -__do_krealloc(const void *p, size_t new_size, gfp_t flags)
> +__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid)
>  {
>  	void *ret;
>  	size_t ks = 0;
> @@ -4859,6 +4859,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>  	if (!kasan_check_byte(p))
>  		return NULL;
>  
> +	/* refuse to proceed if alignment is bigger than what kmalloc() provides */
> +	if (!IS_ALIGNED((unsigned long)p, align) || new_size < align)
> +		return NULL;

Hmm but what happens if `p` is aligned to `align`, but the new object is not?

For example, what will happen if we  allocate object with size=64, align=64
and then do krealloc with size=96, align=64...

Or am I missing something?

> +	/*
> +	 * If reallocation is not necessary (e. g. the new size is less
> +	 * than the current allocated size), the current allocation will be
> +	 * preserved unless __GFP_THISNODE is set. In the latter case a new
> +	 * allocation on the requested node will be attempted.
> +	 */
> +	if (unlikely(flags & __GFP_THISNODE) && nid != NUMA_NO_NODE &&
> +		     nid != page_to_nid(virt_to_page(p)))
> +		goto alloc_new;
> +
>  	if (is_kfence_address(p)) {
>  		ks = orig_size = kfence_ksize(p);
>  	} else {

-- 
Cheers,
Harry / Hyeonggon


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-11  8:58   ` Harry Yoo
@ 2025-07-11 11:11     ` Kent Overstreet
  2025-07-11 15:43     ` Vlastimil Babka
  1 sibling, 0 replies; 28+ messages in thread
From: Kent Overstreet @ 2025-07-11 11:11 UTC (permalink / raw)
  To: Harry Yoo
  Cc: Vitaly Wool, linux-mm, akpm, linux-kernel, Uladzislau Rezki,
	Danilo Krummrich, Alice Ryhl, Vlastimil Babka, rust-for-linux,
	Lorenzo Stoakes, Liam R . Howlett, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On Fri, Jul 11, 2025 at 05:58:23PM +0900, Harry Yoo wrote:
> On Wed, Jul 09, 2025 at 07:24:41PM +0200, Vitaly Wool wrote:
> > Reimplement k[v]realloc_node() to be able to set node and
> > alignment should a user need to do so. In order to do that while
> > retaining the maximal backward compatibility, add
> > k[v]realloc_node_align() functions and redefine the rest of API
> > using these new ones.
> > 
> > While doing that, we also keep the number of  _noprof variants to a
> > minimum, which implies some changes to the existing users of older
> > _noprof functions, that basically being bcachefs.
> > 
> > With that change we also provide the ability for the Rust part of
> > the kernel to set node and alignment in its K[v]xxx
> > [re]allocations.
> > 
> > Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
> > ---
> >  fs/bcachefs/darray.c   |  2 +-
> >  fs/bcachefs/util.h     |  2 +-
> >  include/linux/bpfptr.h |  2 +-
> >  include/linux/slab.h   | 38 +++++++++++++++----------
> >  lib/rhashtable.c       |  4 +--
> >  mm/slub.c              | 64 +++++++++++++++++++++++++++++-------------
> >  6 files changed, 72 insertions(+), 40 deletions(-)
>  
> > diff --git a/mm/slub.c b/mm/slub.c
> > index c4b64821e680..6fad4cdea6c4 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4845,7 +4845,7 @@ void kfree(const void *object)
> >  EXPORT_SYMBOL(kfree);
> >  
> >  static __always_inline __realloc_size(2) void *
> > -__do_krealloc(const void *p, size_t new_size, gfp_t flags)
> > +__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid)
> >  {
> >  	void *ret;
> >  	size_t ks = 0;
> > @@ -4859,6 +4859,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
> >  	if (!kasan_check_byte(p))
> >  		return NULL;
> >  
> > +	/* refuse to proceed if alignment is bigger than what kmalloc() provides */
> > +	if (!IS_ALIGNED((unsigned long)p, align) || new_size < align)
> > +		return NULL;
> 
> Hmm but what happens if `p` is aligned to `align`, but the new object is not?
> 
> For example, what will happen if we  allocate object with size=64, align=64
> and then do krealloc with size=96, align=64...
> 
> Or am I missing something?

You generally need size to be a multiple of align...


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-11  8:58   ` Harry Yoo
  2025-07-11 11:11     ` Kent Overstreet
@ 2025-07-11 15:43     ` Vlastimil Babka
  2025-07-12 12:43       ` Vitaly Wool
  1 sibling, 1 reply; 28+ messages in thread
From: Vlastimil Babka @ 2025-07-11 15:43 UTC (permalink / raw)
  To: Harry Yoo, Vitaly Wool
  Cc: linux-mm, akpm, linux-kernel, Uladzislau Rezki, Danilo Krummrich,
	Alice Ryhl, rust-for-linux, Lorenzo Stoakes, Liam R . Howlett,
	Kent Overstreet, linux-bcachefs, bpf, Herbert Xu, Jann Horn,
	Pedro Falcato

On 7/11/25 10:58, Harry Yoo wrote:
> On Wed, Jul 09, 2025 at 07:24:41PM +0200, Vitaly Wool wrote:
>> Reimplement k[v]realloc_node() to be able to set node and
>> alignment should a user need to do so. In order to do that while
>> retaining the maximal backward compatibility, add
>> k[v]realloc_node_align() functions and redefine the rest of API
>> using these new ones.
>> 
>> While doing that, we also keep the number of  _noprof variants to a
>> minimum, which implies some changes to the existing users of older
>> _noprof functions, that basically being bcachefs.
>> 
>> With that change we also provide the ability for the Rust part of
>> the kernel to set node and alignment in its K[v]xxx
>> [re]allocations.
>> 
>> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
>> ---
>>  fs/bcachefs/darray.c   |  2 +-
>>  fs/bcachefs/util.h     |  2 +-
>>  include/linux/bpfptr.h |  2 +-
>>  include/linux/slab.h   | 38 +++++++++++++++----------
>>  lib/rhashtable.c       |  4 +--
>>  mm/slub.c              | 64 +++++++++++++++++++++++++++++-------------
>>  6 files changed, 72 insertions(+), 40 deletions(-)
>  
>> diff --git a/mm/slub.c b/mm/slub.c
>> index c4b64821e680..6fad4cdea6c4 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -4845,7 +4845,7 @@ void kfree(const void *object)
>>  EXPORT_SYMBOL(kfree);
>>  
>>  static __always_inline __realloc_size(2) void *
>> -__do_krealloc(const void *p, size_t new_size, gfp_t flags)
>> +__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid)
>>  {
>>  	void *ret;
>>  	size_t ks = 0;
>> @@ -4859,6 +4859,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>>  	if (!kasan_check_byte(p))
>>  		return NULL;
>>  
>> +	/* refuse to proceed if alignment is bigger than what kmalloc() provides */
>> +	if (!IS_ALIGNED((unsigned long)p, align) || new_size < align)
>> +		return NULL;
> 
> Hmm but what happens if `p` is aligned to `align`, but the new object is not?
> 
> For example, what will happen if we  allocate object with size=64, align=64
> and then do krealloc with size=96, align=64...
> 
> Or am I missing something?

Good point. We extended the alignment guarantees in commit ad59baa31695
("slab, rust: extend kmalloc() alignment guarantees to remove Rust padding")
for rust in a way that size 96 gives you alignment of 32. It assumes that
rust side will ask for alignments that are power-of-two and sizes that are
multiples of alignment. I think if that assumption is still honored than
this will keep working, but the check added above (is it just a sanity check
or something the rust side relies on?) doesn't seem correct?

>> +	/*
>> +	 * If reallocation is not necessary (e. g. the new size is less
>> +	 * than the current allocated size), the current allocation will be
>> +	 * preserved unless __GFP_THISNODE is set. In the latter case a new
>> +	 * allocation on the requested node will be attempted.
>> +	 */
>> +	if (unlikely(flags & __GFP_THISNODE) && nid != NUMA_NO_NODE &&
>> +		     nid != page_to_nid(virt_to_page(p)))
>> +		goto alloc_new;
>> +
>>  	if (is_kfence_address(p)) {
>>  		ks = orig_size = kfence_ksize(p);
>>  	} else {
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-11 15:43     ` Vlastimil Babka
@ 2025-07-12 12:43       ` Vitaly Wool
  2025-07-14  8:14         ` Vlastimil Babka
  0 siblings, 1 reply; 28+ messages in thread
From: Vitaly Wool @ 2025-07-12 12:43 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Harry Yoo, linux-mm, akpm, linux-kernel, Uladzislau Rezki,
	Danilo Krummrich, Alice Ryhl, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato



> On Jul 11, 2025, at 5:43 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> 
> On 7/11/25 10:58, Harry Yoo wrote:
>> On Wed, Jul 09, 2025 at 07:24:41PM +0200, Vitaly Wool wrote:
>>> Reimplement k[v]realloc_node() to be able to set node and
>>> alignment should a user need to do so. In order to do that while
>>> retaining the maximal backward compatibility, add
>>> k[v]realloc_node_align() functions and redefine the rest of API
>>> using these new ones.
>>> 
>>> While doing that, we also keep the number of  _noprof variants to a
>>> minimum, which implies some changes to the existing users of older
>>> _noprof functions, that basically being bcachefs.
>>> 
>>> With that change we also provide the ability for the Rust part of
>>> the kernel to set node and alignment in its K[v]xxx
>>> [re]allocations.
>>> 
>>> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
>>> ---
>>> fs/bcachefs/darray.c   |  2 +-
>>> fs/bcachefs/util.h     |  2 +-
>>> include/linux/bpfptr.h |  2 +-
>>> include/linux/slab.h   | 38 +++++++++++++++----------
>>> lib/rhashtable.c       |  4 +--
>>> mm/slub.c              | 64 +++++++++++++++++++++++++++++-------------
>>> 6 files changed, 72 insertions(+), 40 deletions(-)
>> 
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index c4b64821e680..6fad4cdea6c4 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -4845,7 +4845,7 @@ void kfree(const void *object)
>>> EXPORT_SYMBOL(kfree);
>>> 
>>> static __always_inline __realloc_size(2) void *
>>> -__do_krealloc(const void *p, size_t new_size, gfp_t flags)
>>> +__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid)
>>> {
>>> void *ret;
>>> size_t ks = 0;
>>> @@ -4859,6 +4859,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>>> if (!kasan_check_byte(p))
>>> return NULL;
>>> 
>>> + /* refuse to proceed if alignment is bigger than what kmalloc() provides */
>>> + if (!IS_ALIGNED((unsigned long)p, align) || new_size < align)
>>> + return NULL;
>> 
>> Hmm but what happens if `p` is aligned to `align`, but the new object is not?
>> 
>> For example, what will happen if we  allocate object with size=64, align=64
>> and then do krealloc with size=96, align=64...
>> 
>> Or am I missing something?
> 
> Good point. We extended the alignment guarantees in commit ad59baa31695
> ("slab, rust: extend kmalloc() alignment guarantees to remove Rust padding")
> for rust in a way that size 96 gives you alignment of 32. It assumes that
> rust side will ask for alignments that are power-of-two and sizes that are
> multiples of alignment. I think if that assumption is still honored than
> this will keep working, but the check added above (is it just a sanity check
> or something the rust side relies on?) doesn't seem correct?
> 

It is a sanity check and it should have looked like this:

        if (!IS_ALIGNED((unsigned long)p, align) && new_size <= ks)
                return NULL;

and the reasoning for this is the following: if we don’t intend to reallocate (new size is not bigger than the original size), but the user requests a larger alignment, it’s a miss. Does that sound reasonable?

~Vitaly



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-12 12:43       ` Vitaly Wool
@ 2025-07-14  8:14         ` Vlastimil Babka
  2025-07-14 15:27           ` Vitaly Wool
  2025-07-15  7:03           ` Alice Ryhl
  0 siblings, 2 replies; 28+ messages in thread
From: Vlastimil Babka @ 2025-07-14  8:14 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Harry Yoo, linux-mm, akpm, linux-kernel, Uladzislau Rezki,
	Danilo Krummrich, Alice Ryhl, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato

On 7/12/25 14:43, Vitaly Wool wrote:
> 
> 
>> On Jul 11, 2025, at 5:43 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> 
>> On 7/11/25 10:58, Harry Yoo wrote:
>>> On Wed, Jul 09, 2025 at 07:24:41PM +0200, Vitaly Wool wrote:
>>>> static __always_inline __realloc_size(2) void *
>>>> -__do_krealloc(const void *p, size_t new_size, gfp_t flags)
>>>> +__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid)
>>>> {
>>>> void *ret;
>>>> size_t ks = 0;
>>>> @@ -4859,6 +4859,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>>>> if (!kasan_check_byte(p))
>>>> return NULL;
>>>> 
>>>> + /* refuse to proceed if alignment is bigger than what kmalloc() provides */
>>>> + if (!IS_ALIGNED((unsigned long)p, align) || new_size < align)
>>>> + return NULL;
>>> 
>>> Hmm but what happens if `p` is aligned to `align`, but the new object is not?
>>> 
>>> For example, what will happen if we  allocate object with size=64, align=64
>>> and then do krealloc with size=96, align=64...
>>> 
>>> Or am I missing something?
>> 
>> Good point. We extended the alignment guarantees in commit ad59baa31695
>> ("slab, rust: extend kmalloc() alignment guarantees to remove Rust padding")
>> for rust in a way that size 96 gives you alignment of 32. It assumes that
>> rust side will ask for alignments that are power-of-two and sizes that are
>> multiples of alignment. I think if that assumption is still honored than
>> this will keep working, but the check added above (is it just a sanity check
>> or something the rust side relies on?) doesn't seem correct?
>> 
> 
> It is a sanity check and it should have looked like this:
> 
>         if (!IS_ALIGNED((unsigned long)p, align) && new_size <= ks)
>                 return NULL;
> 
> and the reasoning for this is the following: if we don’t intend to reallocate (new size is not bigger than the original size), but the user requests a larger alignment, it’s a miss. Does that sound reasonable?

So taking a step back indeed the align passed to krealloc is indeed used
only for this check. If it's really just a sanity check, then I'd rather not
add this parameter to krealloc functions at all - kmalloc() itself also
doesn't have it, so it's inconsistent that krealloc() would have it - but
only to return NULL and not e.g. try to reallocate for alignment.

If it's not just a sanity check, it means it's expected that for some
sequence of valid kvrealloc_node_align() calls it can return NULL and then
rely on the fallback to vmalloc. That would be rather wasteful for the cases
like going from 64 to 96 bytes etc. So in that case it would be better if
krealloc did the reallocation, same as in cases when size increases. Of
course it would still have to rely on the documented alignment guarantees
only and not provide anything arbitrary. aligned_size() in
rust/kernel/alloc/allocator.rs is responsible for that, AFAIK.

And I think it's not a sanity check but the latter - if the following is a
valid k(v)realloc sequence (from Rust POV). The individual size+align
combinations AFAIK are, but if it's valid to make them follow one another
them like this, I don't know.

krealloc(size=96, align=32) -> can give object with 32 alignment only
krealloc(size=64, align=64) -> doesn't increase size but wants alignment 64

> ~Vitaly
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-14  8:14         ` Vlastimil Babka
@ 2025-07-14 15:27           ` Vitaly Wool
  2025-07-15  7:03           ` Alice Ryhl
  1 sibling, 0 replies; 28+ messages in thread
From: Vitaly Wool @ 2025-07-14 15:27 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Harry Yoo, linux-mm, akpm, linux-kernel, Uladzislau Rezki,
	Danilo Krummrich, Alice Ryhl, rust-for-linux, Lorenzo Stoakes,
	Liam R . Howlett, Kent Overstreet, linux-bcachefs, bpf,
	Herbert Xu, Jann Horn, Pedro Falcato



On 7/14/25 10:14, Vlastimil Babka wrote:
> On 7/12/25 14:43, Vitaly Wool wrote:
>>
>>
>>> On Jul 11, 2025, at 5:43 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>
>>> On 7/11/25 10:58, Harry Yoo wrote:
>>>> On Wed, Jul 09, 2025 at 07:24:41PM +0200, Vitaly Wool wrote:
>>>>> static __always_inline __realloc_size(2) void *
>>>>> -__do_krealloc(const void *p, size_t new_size, gfp_t flags)
>>>>> +__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid)
>>>>> {
>>>>> void *ret;
>>>>> size_t ks = 0;
>>>>> @@ -4859,6 +4859,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>>>>> if (!kasan_check_byte(p))
>>>>> return NULL;
>>>>>
>>>>> + /* refuse to proceed if alignment is bigger than what kmalloc() provides */
>>>>> + if (!IS_ALIGNED((unsigned long)p, align) || new_size < align)
>>>>> + return NULL;
>>>>
>>>> Hmm but what happens if `p` is aligned to `align`, but the new object is not?
>>>>
>>>> For example, what will happen if we  allocate object with size=64, align=64
>>>> and then do krealloc with size=96, align=64...
>>>>
>>>> Or am I missing something?
>>>
>>> Good point. We extended the alignment guarantees in commit ad59baa31695
>>> ("slab, rust: extend kmalloc() alignment guarantees to remove Rust padding")
>>> for rust in a way that size 96 gives you alignment of 32. It assumes that
>>> rust side will ask for alignments that are power-of-two and sizes that are
>>> multiples of alignment. I think if that assumption is still honored than
>>> this will keep working, but the check added above (is it just a sanity check
>>> or something the rust side relies on?) doesn't seem correct?
>>>
>>
>> It is a sanity check and it should have looked like this:
>>
>>          if (!IS_ALIGNED((unsigned long)p, align) && new_size <= ks)
>>                  return NULL;
>>
>> and the reasoning for this is the following: if we don’t intend to reallocate (new size is not bigger than the original size), but the user requests a larger alignment, it’s a miss. Does that sound reasonable?
> 
> So taking a step back indeed the align passed to krealloc is indeed used
> only for this check. If it's really just a sanity check, then I'd rather not
> add this parameter to krealloc functions at all - kmalloc() itself also
> doesn't have it, so it's inconsistent that krealloc() would have it - but
> only to return NULL and not e.g. try to reallocate for alignment.
> 
> If it's not just a sanity check, it means it's expected that for some
> sequence of valid kvrealloc_node_align() calls it can return NULL and then
> rely on the fallback to vmalloc. That would be rather wasteful for the cases
> like going from 64 to 96 bytes etc. So in that case it would be better if
> krealloc did the reallocation, same as in cases when size increases. Of
> course it would still have to rely on the documented alignment guarantees
> only and not provide anything arbitrary. aligned_size() in
> rust/kernel/alloc/allocator.rs is responsible for that, AFAIK.
> 
> And I think it's not a sanity check but the latter - if the following is a
> valid k(v)realloc sequence (from Rust POV). The individual size+align
> combinations AFAIK are, but if it's valid to make them follow one another
> them like this, I don't know.
> 
> krealloc(size=96, align=32) -> can give object with 32 alignment only
> krealloc(size=64, align=64) -> doesn't increase size but wants alignment 64
> 

We should be able to correctly process these. I agree that making such 
cases fall back to vrealloc is suboptimal but it's a technically correct 
behavior. I understand that you would rather have a reallocation on the 
slub side in these cases, so this will look as

           if (!IS_ALIGNED((unsigned long)p, align) && new_size <= ks)
                   goto alloc_new;

I'll modify/retest for the next patchset iteration.

~Vitaly


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc
  2025-07-14  8:14         ` Vlastimil Babka
  2025-07-14 15:27           ` Vitaly Wool
@ 2025-07-15  7:03           ` Alice Ryhl
  1 sibling, 0 replies; 28+ messages in thread
From: Alice Ryhl @ 2025-07-15  7:03 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Vitaly Wool, Harry Yoo, linux-mm, akpm, linux-kernel,
	Uladzislau Rezki, Danilo Krummrich, rust-for-linux,
	Lorenzo Stoakes, Liam R . Howlett, Kent Overstreet,
	linux-bcachefs, bpf, Herbert Xu, Jann Horn, Pedro Falcato

On Mon, Jul 14, 2025 at 10:14 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 7/12/25 14:43, Vitaly Wool wrote:
> >
> >
> >> On Jul 11, 2025, at 5:43 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> >>
> >> On 7/11/25 10:58, Harry Yoo wrote:
> >>> On Wed, Jul 09, 2025 at 07:24:41PM +0200, Vitaly Wool wrote:
> >>>> static __always_inline __realloc_size(2) void *
> >>>> -__do_krealloc(const void *p, size_t new_size, gfp_t flags)
> >>>> +__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid)
> >>>> {
> >>>> void *ret;
> >>>> size_t ks = 0;
> >>>> @@ -4859,6 +4859,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
> >>>> if (!kasan_check_byte(p))
> >>>> return NULL;
> >>>>
> >>>> + /* refuse to proceed if alignment is bigger than what kmalloc() provides */
> >>>> + if (!IS_ALIGNED((unsigned long)p, align) || new_size < align)
> >>>> + return NULL;
> >>>
> >>> Hmm but what happens if `p` is aligned to `align`, but the new object is not?
> >>>
> >>> For example, what will happen if we  allocate object with size=64, align=64
> >>> and then do krealloc with size=96, align=64...
> >>>
> >>> Or am I missing something?
> >>
> >> Good point. We extended the alignment guarantees in commit ad59baa31695
> >> ("slab, rust: extend kmalloc() alignment guarantees to remove Rust padding")
> >> for rust in a way that size 96 gives you alignment of 32. It assumes that
> >> rust side will ask for alignments that are power-of-two and sizes that are
> >> multiples of alignment. I think if that assumption is still honored than
> >> this will keep working, but the check added above (is it just a sanity check
> >> or something the rust side relies on?) doesn't seem correct?
> >>
> >
> > It is a sanity check and it should have looked like this:
> >
> >         if (!IS_ALIGNED((unsigned long)p, align) && new_size <= ks)
> >                 return NULL;
> >
> > and the reasoning for this is the following: if we don’t intend to reallocate (new size is not bigger than the original size), but the user requests a larger alignment, it’s a miss. Does that sound reasonable?
>
> So taking a step back indeed the align passed to krealloc is indeed used
> only for this check. If it's really just a sanity check, then I'd rather not
> add this parameter to krealloc functions at all - kmalloc() itself also
> doesn't have it, so it's inconsistent that krealloc() would have it - but
> only to return NULL and not e.g. try to reallocate for alignment.
>
> If it's not just a sanity check, it means it's expected that for some
> sequence of valid kvrealloc_node_align() calls it can return NULL and then
> rely on the fallback to vmalloc. That would be rather wasteful for the cases
> like going from 64 to 96 bytes etc. So in that case it would be better if
> krealloc did the reallocation, same as in cases when size increases. Of
> course it would still have to rely on the documented alignment guarantees
> only and not provide anything arbitrary. aligned_size() in
> rust/kernel/alloc/allocator.rs is responsible for that, AFAIK.
>
> And I think it's not a sanity check but the latter - if the following is a
> valid k(v)realloc sequence (from Rust POV). The individual size+align
> combinations AFAIK are, but if it's valid to make them follow one another
> them like this, I don't know.
>
> krealloc(size=96, align=32) -> can give object with 32 alignment only
> krealloc(size=64, align=64) -> doesn't increase size but wants alignment 64

On the Rust side, you specify what the old size/align was, and what
you which size/align you want after the call, and they can be anything
including that combination.

Alice


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2025-07-15  7:03 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-09 17:23 [PATCH v12 0/4] support large align and nid in Rust allocators Vitaly Wool
2025-07-09 17:24 ` [PATCH v12 1/4] mm/vmalloc: allow to set node and align in vrealloc Vitaly Wool
2025-07-09 19:01   ` Liam R. Howlett
2025-07-10  6:21     ` Vitaly Wool
2025-07-10 15:09       ` Liam R. Howlett
2025-07-10 15:19       ` Lorenzo Stoakes
2025-07-10 18:57         ` Vitaly Wool
2025-07-10 14:07     ` Vitaly Wool
2025-07-09 22:53   ` Alexei Starovoitov
2025-07-09 22:57     ` Danilo Krummrich
2025-07-09 23:14       ` Alexei Starovoitov
2025-07-09 23:26         ` Danilo Krummrich
2025-07-10  0:39           ` Alexei Starovoitov
2025-07-10  6:16             ` Vitaly Wool
2025-07-10  9:57             ` Danilo Krummrich
2025-07-09 17:24 ` [PATCH v12 2/4] mm/slub: allow to set node and align in k[v]realloc Vitaly Wool
2025-07-10  7:45   ` Vlastimil Babka
2025-07-11  8:58   ` Harry Yoo
2025-07-11 11:11     ` Kent Overstreet
2025-07-11 15:43     ` Vlastimil Babka
2025-07-12 12:43       ` Vitaly Wool
2025-07-14  8:14         ` Vlastimil Babka
2025-07-14 15:27           ` Vitaly Wool
2025-07-15  7:03           ` Alice Ryhl
2025-07-09 17:24 ` [PATCH v12 3/4] rust: add support for NUMA ids in allocations Vitaly Wool
2025-07-09 20:58   ` Danilo Krummrich
2025-07-09 17:25 ` [PATCH v12 4/4] rust: support large alignments " Vitaly Wool
2025-07-09 21:00   ` Danilo Krummrich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).