* Re: [PATCH v3 2/2] slab: fix kernel-docs for mm-api [not found] ` <CANpmjNN_=g31Eoa+w1NrFALfp1dDBi5oHEZdr_bA_48-tS2M=Q@mail.gmail.com> @ 2026-05-04 15:00 ` Marco Elver 2026-05-11 12:07 ` Vlastimil Babka (SUSE) 2026-05-11 12:19 ` Jonathan Corbet 0 siblings, 2 replies; 12+ messages in thread From: Marco Elver @ 2026-05-04 15:00 UTC (permalink / raw) To: Vlastimil Babka (SUSE) Cc: Andrew Morton, Jonathan Corbet, Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Harry Yoo, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, linux-doc@vger.kernel.org On Thu, Apr 30, 2026 at 03:59PM +0200, Marco Elver wrote: > On Thu, 30 Apr 2026 at 15:40, Vlastimil Babka (SUSE) <vbabka@kernel.org> wrote: > > > > On 4/24/26 15:24, Marco Elver wrote: > > > The mm-api kernel-doc comments have been broken for a while, as many > > > documented symbols shifted from being direct function definitions to > > > macros wrapping _noprof implementations during the introduction of > > > allocation tagging (starting with commit 7bd230a26648 "mm/slab: enable > > > slab allocation tagging for kmalloc and friends"). > > > > > > When the kernel-doc block remains above the internal implementation > > > function but uses the public API name, the documentation generator fails > > > to associate the documented symbol and generates warnings and fails to > > > emit the documentation. > > > > > > Fix this by: > > > > > > 1. Moving the kernel-doc comment blocks from slub.c to slab.h, placing > > > them directly above the user-facing macros. > > > > > > 2. Converting the variadic macros for the documented APIs to use > > > explicit arguments. > > > > > > No functional change intended. 
> > > > > > Signed-off-by: Marco Elver <elver@google.com> > > > > +Cc Jon > > > > I thought it was supposed to work because the kernel-doc scripts were at the > > time taught by commit 51a7bf0238c2 ("scripts/kernel-doc: drop "_noprof" on > > function prototypes") to handle _noprof. In the current form git grep finds: > > > > tools/lib/python/kdoc/kdoc_parser.py: suffixes = [ '_noprof' ] > > tools/lib/python/kdoc/xforms_lists.py: (KernRe("_noprof"), ""), > > > > Doesn't it work for you then? > > Ah, I see. So it doesn't work anymore because we add the '_' prefix, too. > > I guess the question is if we want to proliferate more kdoc parser > special cases, or just move the docs to the macros. The downside of > macros is that they lose the types in the displayed function > signature. > > Preferences? How about the below, i.e. adding type decls that only the kernel-doc parser sees? One complication is also DECL_KMALLOC_PARAMS, and adding kernel-doc parser hacks for that looks pretty awful, so this is a lot cleaner. ------ >8 ------ From: Marco Elver <elver@google.com> Date: Tue, 21 Apr 2026 13:48:21 +0200 Subject: [PATCH] slab: fix kernel-docs for mm-api The mm-api kernel-docs have been disconnected from their symbols. While the scripts were previously taught to handle the _noprof suffix added by allocation tagging (in 51a7bf0238c2 "scripts/kernel-doc: drop "_noprof" on function prototypes"), this does not handle cases where the internal implementation function has an additional leading underscore. The added optional parameters (via DECL_KMALLOC_PARAMS) further complicate parsing the internal signatures. When the kernel-doc block remains above the internal implementation function but uses the public API name, the documentation generator fails to associate the documented symbol. Simply moving the docs to the macros in slab.h fixes the association but causes loss of types in the generated documentation (rendering as e.g. untyped 'kmalloc(size, flags)' macro). 
Fix this by: 1. Moving the kernel-doc comment blocks from slub.c to slab.h, placing them directly above the user-facing macros. 2. Providing explicit, typed C prototypes for the documented APIs inside '#if 0 /* kernel-doc */' blocks. 3. Converting the variadic macros for the documented APIs to use explicit arguments to match the documentation. No functional change intended. Signed-off-by: Marco Elver <elver@google.com> --- v4: * Provide typed C prototypes in '#if 0' blocks to properly render API. --- include/linux/slab.h | 216 +++++++++++++++++++++++++++++++++---------- mm/slub.c | 98 -------------------- 2 files changed, 168 insertions(+), 146 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index c232f8a10af6..5e1249e36b0d 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -530,7 +530,49 @@ void * __must_check krealloc_node_align_noprof(const void *objp, unsigned long align, gfp_t flags, int nid) __realloc_size(2); #define krealloc_noprof(_o, _s, _f) krealloc_node_align_noprof(_o, PASS_TOKEN_PARAMS(_s, __kmalloc_token(_s)), 1, _f, NUMA_NO_NODE) -#define krealloc_node_align(_o, _s, _a, _f, _n) alloc_hooks(krealloc_node_align_noprof(_o, PASS_TOKEN_PARAMS(_s, __kmalloc_token(_s)), _a, _f, _n)) +#if 0 /* kernel-doc */ +/** + * krealloc_node_align - reallocate memory. The contents will remain unchanged. + * @p: object to reallocate memory for. + * @new_size: how many bytes of memory are required. + * @align: desired alignment. + * @flags: the type of memory to allocate. + * @nid: NUMA node or NUMA_NO_NODE + * + * If @p is %NULL, krealloc() behaves exactly like kmalloc(). If @new_size + * is 0 and @p is not a %NULL pointer, the object pointed to is freed. + * + * Only alignments up to those guaranteed by kmalloc() will be honored. Please see + * Documentation/core-api/memory-allocation.rst for more details. 
+ * + * If __GFP_ZERO logic is requested, callers must ensure that, starting with the + * initial memory allocation, every subsequent call to this API for the same + * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that + * __GFP_ZERO is not fully honored by this API. + * + * When slub_debug_orig_size() is off, krealloc() only knows about the bucket + * size of an allocation (but not the exact size it was allocated with) and + * hence implements the following semantics for shrinking and growing buffers + * with __GFP_ZERO:: + * + * new bucket + * 0 size size + * |--------|----------------| + * | keep | zero | + * + * Otherwise, the original allocation size 'orig_size' could be used to + * precisely clear the requested size, and the new size will also be stored + * as the new 'orig_size'. + * + * In any case, the contents of the object pointed to are preserved up to the + * lesser of the new and old sizes. + * + * Return: pointer to the allocated memory or %NULL in case of error + */ +void *krealloc_node_align(const void *p, size_t new_size, unsigned long align, gfp_t flags, int nid); +#endif +#define krealloc_node_align(p, new_size, align, flags, nid) \ + alloc_hooks(krealloc_node_align_noprof(p, PASS_TOKEN_PARAMS(new_size, __kmalloc_token(new_size)), align, flags, nid)) #define krealloc_node(_o, _s, _f, _n) krealloc_node_align(_o, _s, 1, _f, _n) #define krealloc(...) 
krealloc_node(__VA_ARGS__, NUMA_NO_NODE) @@ -913,6 +955,23 @@ void *__kmalloc_large_noprof(size_t size, gfp_t flags) void *__kmalloc_large_node_noprof(size_t size, gfp_t flags, int node) __assume_page_alignment __alloc_size(1); +static __always_inline __alloc_size(1) void *_kmalloc_noprof(size_t size, gfp_t flags, kmalloc_token_t token) +{ + if (__builtin_constant_p(size) && size) { + unsigned int index; + + if (size > KMALLOC_MAX_CACHE_SIZE) + return __kmalloc_large_noprof(size, flags); + + index = kmalloc_index(size); + return __kmalloc_cache_noprof( + kmalloc_caches[kmalloc_type(flags, token)][index], + flags, size); + } + return __kmalloc_noprof(PASS_KMALLOC_PARAMS(size, NULL, token), flags); +} +#define kmalloc_noprof(...) _kmalloc_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) +#if 0 /* kernel-doc */ /** * kmalloc - allocate kernel memory * @size: how many bytes of memory are required. @@ -968,27 +1027,27 @@ void *__kmalloc_large_node_noprof(size_t size, gfp_t flags, int node) * Try really hard to succeed the allocation but fail * eventually. */ -static __always_inline __alloc_size(1) void *_kmalloc_noprof(size_t size, gfp_t flags, kmalloc_token_t token) -{ - if (__builtin_constant_p(size) && size) { - unsigned int index; - - if (size > KMALLOC_MAX_CACHE_SIZE) - return __kmalloc_large_noprof(size, flags); - - index = kmalloc_index(size); - return __kmalloc_cache_noprof( - kmalloc_caches[kmalloc_type(flags, token)][index], - flags, size); - } - return __kmalloc_noprof(PASS_KMALLOC_PARAMS(size, NULL, token), flags); -} -#define kmalloc_noprof(...) _kmalloc_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) -#define kmalloc(...) 
alloc_hooks(kmalloc_noprof(__VA_ARGS__)) +void *kmalloc(size_t size, gfp_t flags); +#endif +#define kmalloc(size, flags) alloc_hooks(kmalloc_noprof(size, flags)) void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node); #define kmalloc_nolock_noprof(_s, _f, _n) _kmalloc_nolock_noprof(PASS_TOKEN_PARAMS(_s, __kmalloc_token(_s)), _f, _n) -#define kmalloc_nolock(...) alloc_hooks(kmalloc_nolock_noprof(__VA_ARGS__)) +#if 0 /* kernel-doc */ +/** + * kmalloc_nolock - Allocate an object of given size from any context. + * @size: size to allocate + * @gfp_flags: GFP flags. Only __GFP_ACCOUNT, __GFP_ZERO, __GFP_NO_OBJ_EXT + * allowed. + * @node: node number of the target node. + * + * Return: pointer to the new object or NULL in case of error. + * NULL does not mean EBUSY or EAGAIN. It means ENOMEM. + * There is no reason to call it again and expect !NULL. + */ +void *kmalloc_nolock(size_t size, gfp_t gfp_flags, int node); +#endif +#define kmalloc_nolock(size, gfp_flags, node) alloc_hooks(kmalloc_nolock_noprof(size, gfp_flags, node)) /** * __alloc_objs - Allocate objects of a given type using @@ -1115,23 +1174,40 @@ static __always_inline __alloc_size(1) void *_kmalloc_node_noprof(size_t size, g #define kmalloc_node_noprof(...) _kmalloc_node_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) #define kmalloc_node(...) alloc_hooks(kmalloc_node_noprof(__VA_ARGS__)) +static inline __alloc_size(1, 2) void *_kmalloc_array_noprof(size_t n, size_t size, gfp_t flags, kmalloc_token_t token) +{ + size_t bytes; + + if (unlikely(check_mul_overflow(n, size, &bytes))) + return NULL; + return _kmalloc_noprof(bytes, flags, token); +} +#define kmalloc_array_noprof(...) _kmalloc_array_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) +#if 0 /* kernel-doc */ /** * kmalloc_array - allocate memory for an array. * @n: number of elements. * @size: element size. * @flags: the type of memory to allocate (see kmalloc). 
*/ -static inline __alloc_size(1, 2) void *_kmalloc_array_noprof(size_t n, size_t size, gfp_t flags, kmalloc_token_t token) +void *kmalloc_array(size_t n, size_t size, gfp_t flags); +#endif +#define kmalloc_array(n, size, flags) alloc_hooks(kmalloc_array_noprof(n, size, flags)) + +static inline __realloc_size(2, 3) void * __must_check _krealloc_array_noprof(void *p, + size_t new_n, + size_t new_size, + gfp_t flags, kmalloc_token_t token) { size_t bytes; - if (unlikely(check_mul_overflow(n, size, &bytes))) + if (unlikely(check_mul_overflow(new_n, new_size, &bytes))) return NULL; - return _kmalloc_noprof(bytes, flags, token); -} -#define kmalloc_array_noprof(...) _kmalloc_array_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) -#define kmalloc_array(...) alloc_hooks(kmalloc_array_noprof(__VA_ARGS__)) + return krealloc_node_align_noprof(p, PASS_TOKEN_PARAMS(bytes, token), 1, flags, NUMA_NO_NODE); +} +#define krealloc_array_noprof(...) _krealloc_array_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) +#if 0 /* kernel-doc */ /** * krealloc_array - reallocate memory for an array. * @p: pointer to the memory chunk to reallocate @@ -1149,20 +1225,9 @@ static inline __alloc_size(1, 2) void *_kmalloc_array_noprof(size_t n, size_t si * In any case, the contents of the object pointed to are preserved up to the * lesser of the new and old sizes. */ -static inline __realloc_size(2, 3) void * __must_check _krealloc_array_noprof(void *p, - size_t new_n, - size_t new_size, - gfp_t flags, kmalloc_token_t token) -{ - size_t bytes; - - if (unlikely(check_mul_overflow(new_n, new_size, &bytes))) - return NULL; - - return krealloc_node_align_noprof(p, PASS_TOKEN_PARAMS(bytes, token), 1, flags, NUMA_NO_NODE); -} -#define krealloc_array_noprof(...) _krealloc_array_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) -#define krealloc_array(...) 
alloc_hooks(krealloc_array_noprof(__VA_ARGS__)) +void *krealloc_array(void *p, size_t new_n, size_t new_size, gfp_t flags); +#endif +#define krealloc_array(p, new_n, new_size, flags) alloc_hooks(krealloc_array_noprof(p, new_n, new_size, flags)) /** * kcalloc - allocate memory for an array. The memory is set to zero. @@ -1214,17 +1279,20 @@ static inline __alloc_size(1, 2) void *_kmalloc_array_node_noprof(size_t n, size */ #define kmem_cache_zalloc(_k, _flags) kmem_cache_alloc(_k, (_flags)|__GFP_ZERO) -/** - * kzalloc - allocate memory. The memory is set to zero. - * @size: how many bytes of memory are required. - * @flags: the type of memory to allocate (see kmalloc). - */ static inline __alloc_size(1) void *_kzalloc_noprof(size_t size, gfp_t flags, kmalloc_token_t token) { return _kmalloc_noprof(size, flags | __GFP_ZERO, token); } #define kzalloc_noprof(...) _kzalloc_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) -#define kzalloc(...) alloc_hooks(kzalloc_noprof(__VA_ARGS__)) +#if 0 /* kernel-doc */ +/** + * kzalloc - allocate memory. The memory is set to zero. + * @size: how many bytes of memory are required. + * @flags: the type of memory to allocate (see kmalloc). + */ +void *kzalloc(size_t size, gfp_t flags); +#endif +#define kzalloc(size, flags) alloc_hooks(kzalloc_noprof(size, flags)) #define kzalloc_node(_size, _flags, _node) kmalloc_node(_size, (_flags)|__GFP_ZERO, _node) void *__kvmalloc_node_noprof(DECL_KMALLOC_PARAMS(size, b, token), unsigned long align, @@ -1233,7 +1301,29 @@ void *__kvmalloc_node_noprof(DECL_KMALLOC_PARAMS(size, b, token), unsigned long __kvmalloc_node_noprof(PASS_KMALLOC_PARAMS(_size, NULL, __kmalloc_token(_size)), _align, _flags, _node) #define kvmalloc_node_align(...) 
\ alloc_hooks(kvmalloc_node_align_noprof(__VA_ARGS__)) -#define kvmalloc_node(_s, _f, _n) kvmalloc_node_align(_s, 1, _f, _n) +#if 0 /* kernel-doc */ +/** + * kvmalloc_node - attempt to allocate physically contiguous memory, but upon + * failure, fall back to non-contiguous (vmalloc) allocation. + * @size: size of the request. + * @flags: gfp mask for the allocation - must be compatible (superset) with GFP_KERNEL. + * @node: numa node to allocate from + * + * Only alignments up to those guaranteed by kmalloc() will be honored. Please see + * Documentation/core-api/memory-allocation.rst for more details. + * + * Uses kmalloc to get the memory but if the allocation fails then falls back + * to the vmalloc allocator. Use kvfree for freeing the memory. + * + * GFP_NOWAIT and GFP_ATOMIC are supported, the __GFP_NORETRY modifier is not. + * __GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is + * preferable to the vmalloc fallback, due to visible performance drawbacks. + * + * Return: pointer to the allocated memory or %NULL in case of failure + */ +void *kvmalloc_node(size_t size, gfp_t flags, int node); +#endif +#define kvmalloc_node(size, flags, node) kvmalloc_node_align(size, 1, flags, node) #define kvmalloc_node_noprof(size, flags, node) \ kvmalloc_node_align_noprof(size, 1, flags, node) #define kvmalloc(...) 
kvmalloc_node(__VA_ARGS__, NUMA_NO_NODE) @@ -1266,8 +1356,38 @@ _kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node, kmallo void *kvrealloc_node_align_noprof(const void *p, DECL_TOKEN_PARAMS(size, token), unsigned long align, gfp_t flags, int nid) __realloc_size(2); -#define kvrealloc_node_align(_p, _s, _a, _f, _n) \ - alloc_hooks(kvrealloc_node_align_noprof(_p, PASS_TOKEN_PARAMS(_s, __kmalloc_token(_s)), _a, _f, _n)) +#if 0 /* kernel-doc */ +/** + * kvrealloc_node_align - reallocate memory; contents remain unchanged + * @p: object to reallocate memory for + * @size: the size to reallocate + * @align: desired alignment + * @flags: the flags for the page level allocator + * @nid: NUMA node id + * + * If @p is %NULL, kvrealloc() behaves exactly like kvmalloc(). If @size is 0 + * and @p is not a %NULL pointer, the object pointed to is freed. + * + * Only alignments up to those guaranteed by kmalloc() will be honored. Please see + * Documentation/core-api/memory-allocation.rst for more details. + * + * If __GFP_ZERO logic is requested, callers must ensure that, starting with the + * initial memory allocation, every subsequent call to this API for the same + * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that + * __GFP_ZERO is not fully honored by this API. + * + * In any case, the contents of the object pointed to are preserved up to the + * lesser of the new and old sizes. + * + * This function must not be called concurrently with itself or kvfree() for the + * same memory allocation. 
+ * + * Return: pointer to the allocated memory or %NULL in case of error + */ +void *kvrealloc_node_align(const void *p, size_t size, unsigned long align, gfp_t flags, int nid); +#endif +#define kvrealloc_node_align(p, size, align, flags, nid) \ + alloc_hooks(kvrealloc_node_align_noprof(p, PASS_TOKEN_PARAMS(size, __kmalloc_token(size)), align, flags, nid)) #define kvrealloc_node(_p, _s, _f, _n) kvrealloc_node_align(_p, _s, 1, _f, _n) #define kvrealloc(...) kvrealloc_node(__VA_ARGS__, NUMA_NO_NODE) diff --git a/mm/slub.c b/mm/slub.c index ccb208cfbecd..dbc3c947e5be 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -5310,17 +5310,6 @@ void *__kmalloc_noprof(DECL_KMALLOC_PARAMS(size, b, token), gfp_t flags) } EXPORT_SYMBOL(__kmalloc_noprof); -/** - * kmalloc_nolock - Allocate an object of given size from any context. - * @size: size to allocate - * @gfp_flags: GFP flags. Only __GFP_ACCOUNT, __GFP_ZERO, __GFP_NO_OBJ_EXT - * allowed. - * @node: node number of the target node. - * - * Return: pointer to the new object or NULL in case of error. - * NULL does not mean EBUSY or EAGAIN. It means ENOMEM. - * There is no reason to call it again and expect !NULL. - */ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node) { gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags; @@ -6717,44 +6706,6 @@ __do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t flags, return ret; } -/** - * krealloc_node_align - reallocate memory. The contents will remain unchanged. - * @p: object to reallocate memory for. - * @new_size: how many bytes of memory are required. - * @align: desired alignment. - * @flags: the type of memory to allocate. - * @nid: NUMA node or NUMA_NO_NODE - * - * If @p is %NULL, krealloc() behaves exactly like kmalloc(). If @new_size - * is 0 and @p is not a %NULL pointer, the object pointed to is freed. - * - * Only alignments up to those guaranteed by kmalloc() will be honored. 
Please see - * Documentation/core-api/memory-allocation.rst for more details. - * - * If __GFP_ZERO logic is requested, callers must ensure that, starting with the - * initial memory allocation, every subsequent call to this API for the same - * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that - * __GFP_ZERO is not fully honored by this API. - * - * When slub_debug_orig_size() is off, krealloc() only knows about the bucket - * size of an allocation (but not the exact size it was allocated with) and - * hence implements the following semantics for shrinking and growing buffers - * with __GFP_ZERO:: - * - * new bucket - * 0 size size - * |--------|----------------| - * | keep | zero | - * - * Otherwise, the original allocation size 'orig_size' could be used to - * precisely clear the requested size, and the new size will also be stored - * as the new 'orig_size'. - * - * In any case, the contents of the object pointed to are preserved up to the - * lesser of the new and old sizes. - * - * Return: pointer to the allocated memory or %NULL in case of error - */ void *krealloc_node_align_noprof(const void *p, DECL_TOKEN_PARAMS(new_size, token), unsigned long align, gfp_t flags, int nid) { @@ -6797,28 +6748,6 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size) return flags; } -/** - * __kvmalloc_node - attempt to allocate physically contiguous memory, but upon - * failure, fall back to non-contiguous (vmalloc) allocation. - * @size: size of the request. - * @b: which set of kmalloc buckets to allocate from. - * @token: allocation token. - * @align: desired alignment. - * @flags: gfp mask for the allocation - must be compatible (superset) with GFP_KERNEL. - * @node: numa node to allocate from - * - * Only alignments up to those guaranteed by kmalloc() will be honored. Please see - * Documentation/core-api/memory-allocation.rst for more details. 
- * - * Uses kmalloc to get the memory but if the allocation fails then falls back - * to the vmalloc allocator. Use kvfree for freeing the memory. - * - * GFP_NOWAIT and GFP_ATOMIC are supported, the __GFP_NORETRY modifier is not. - * __GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is - * preferable to the vmalloc fallback, due to visible performance drawbacks. - * - * Return: pointer to the allocated memory of %NULL in case of failure - */ void *__kvmalloc_node_noprof(DECL_KMALLOC_PARAMS(size, b, token), unsigned long align, gfp_t flags, int node) { @@ -6900,33 +6829,6 @@ void kvfree_sensitive(const void *addr, size_t len) } EXPORT_SYMBOL(kvfree_sensitive); -/** - * kvrealloc_node_align - reallocate memory; contents remain unchanged - * @p: object to reallocate memory for - * @size: the size to reallocate - * @align: desired alignment - * @flags: the flags for the page level allocator - * @nid: NUMA node id - * - * If @p is %NULL, kvrealloc() behaves exactly like kvmalloc(). If @size is 0 - * and @p is not a %NULL pointer, the object pointed to is freed. - * - * Only alignments up to those guaranteed by kmalloc() will be honored. Please see - * Documentation/core-api/memory-allocation.rst for more details. - * - * If __GFP_ZERO logic is requested, callers must ensure that, starting with the - * initial memory allocation, every subsequent call to this API for the same - * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that - * __GFP_ZERO is not fully honored by this API. - * - * In any case, the contents of the object pointed to are preserved up to the - * lesser of the new and old sizes. - * - * This function must not be called concurrently with itself or kvfree() for the - * same memory allocation. 
- * - * Return: pointer to the allocated memory or %NULL in case of error - */ void *kvrealloc_node_align_noprof(const void *p, DECL_TOKEN_PARAMS(size, token), unsigned long align, gfp_t flags, int nid) { -- 2.54.0.545.g6539524ca2-goog ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v3 2/2] slab: fix kernel-docs for mm-api 2026-05-04 15:00 ` [PATCH v3 2/2] slab: fix kernel-docs for mm-api Marco Elver @ 2026-05-11 12:07 ` Vlastimil Babka (SUSE) 2026-05-11 12:19 ` Jonathan Corbet 1 sibling, 0 replies; 12+ messages in thread From: Vlastimil Babka (SUSE) @ 2026-05-11 12:07 UTC (permalink / raw) To: Marco Elver, Jonathan Corbet Cc: Andrew Morton, Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Harry Yoo, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, linux-doc@vger.kernel.org On 5/4/26 17:00, Marco Elver wrote: > On Thu, Apr 30, 2026 at 03:59PM +0200, Marco Elver wrote: >> On Thu, 30 Apr 2026 at 15:40, Vlastimil Babka (SUSE) <vbabka@kernel.org> wrote: >> > >> > On 4/24/26 15:24, Marco Elver wrote: >> > > The mm-api kernel-doc comments have been broken for a while, as many >> > > documented symbols shifted from being direct function definitions to >> > > macros wrapping _noprof implementations during the introduction of >> > > allocation tagging (starting with commit 7bd230a26648 "mm/slab: enable >> > > slab allocation tagging for kmalloc and friends"). >> > > >> > > When the kernel-doc block remains above the internal implementation >> > > function but uses the public API name, the documentation generator fails >> > > to associate the documented symbol and generates warnings and fails to >> > > emit the documentation. >> > > >> > > Fix this by: >> > > >> > > 1. Moving the kernel-doc comment blocks from slub.c to slab.h, placing >> > > them directly above the user-facing macros. >> > > >> > > 2. Converting the variadic macros for the documented APIs to use >> > > explicit arguments. 
>> > > >> > > No functional change intended. >> > > >> > > Signed-off-by: Marco Elver <elver@google.com> >> > >> > +Cc Jon >> > >> > I thought it was supposed to work because the kernel-doc scripts were at the >> > time taught by commit 51a7bf0238c2 ("scripts/kernel-doc: drop "_noprof" on >> > function prototypes") to handle _noprof. In the current form git grep finds: >> > >> > tools/lib/python/kdoc/kdoc_parser.py: suffixes = [ '_noprof' ] >> > tools/lib/python/kdoc/xforms_lists.py: (KernRe("_noprof"), ""), >> > >> > Doesn't it work for you then? >> >> Ah, I see. So it doesn't work anymore because we add the '_' prefix, too. >> >> I guess the question is if we want to proliferate more kdoc parser >> special cases, or just move the docs to the macros. The downside of >> macros is that they lose the types in the displayed function >> signature. >> >> Preferences? > > How about the below, i.e. adding type decls that only the kernel-doc > parser sees? One complication is also DECL_KMALLOC_PARAMS, and adding > kernel-doc parser hacks for that looks pretty awful, so this is a lot > cleaner. Looks like a good workaround to me, unless something gets confused by seeing both the declaration and the define. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 2/2] slab: fix kernel-docs for mm-api 2026-05-04 15:00 ` [PATCH v3 2/2] slab: fix kernel-docs for mm-api Marco Elver 2026-05-11 12:07 ` Vlastimil Babka (SUSE) @ 2026-05-11 12:19 ` Jonathan Corbet 2026-05-11 16:34 ` Marco Elver 1 sibling, 1 reply; 12+ messages in thread From: Jonathan Corbet @ 2026-05-11 12:19 UTC (permalink / raw) To: Marco Elver, Vlastimil Babka (SUSE) Cc: Andrew Morton, Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Harry Yoo, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, linux-doc@vger.kernel.org Marco Elver <elver@google.com> writes: > How about the below, i.e. adding type decls that only the kernel-doc > parser sees? One complication is also DECL_KMALLOC_PARAMS, and adding > kernel-doc parser hacks for that looks pretty awful, so this is a lot > cleaner. I'm going to be a while catching up with things, so this is just a first take. I strongly suspect that the people who object so strongly to documentation markup in general would be less than fully thrilled by the addition of this kind of workaround. I'd like to ponder a bit and see if I can come up with something better...but again, it won't happen right away. Thanks, jon ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 2/2] slab: fix kernel-docs for mm-api 2026-05-11 12:19 ` Jonathan Corbet @ 2026-05-11 16:34 ` Marco Elver 0 siblings, 0 replies; 12+ messages in thread From: Marco Elver @ 2026-05-11 16:34 UTC (permalink / raw) To: Jonathan Corbet Cc: Vlastimil Babka (SUSE), Andrew Morton, Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Harry Yoo, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, linux-doc@vger.kernel.org On Mon, 11 May 2026 at 14:19, Jonathan Corbet <corbet@lwn.net> wrote: > > Marco Elver <elver@google.com> writes: > > > How about the below, i.e. adding type decls that only the kernel-doc > > parser sees? One complication is also DECL_KMALLOC_PARAMS, and adding > > kernel-doc parser hacks for that looks pretty awful, so this is a lot > > cleaner. > > I'm going to be a while catching up with things, so this is just a first > take. I strongly suspect that the people who object so strongly to > documentation markup in general would be less than fully thrilled by the > addition of this kind of workaround. I'd like to ponder a bit and see > if I can some up with something better...but again, it won't happen > right away. Thanks, Jon. The technical debt (hacks) that would accumulate in the kernel-doc parser just for accommodating slab.h might be quite high; slab.h is rather special, so the "#if 0" solution might be justified. Then again, if there's a reasonable kernel-doc parser solution, that also helps in other places, I won't object. I tested the "#if 0" version and it works as expected. So to move forward, we could consider it in the short term, and in the longer term, see what new powers the kernel-doc parser can provide. 
^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <6f2bd63a-dc02-4631-a3a5-7ec8e58a4a4e@kernel.org>]
* Re: [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning [not found] ` <6f2bd63a-dc02-4631-a3a5-7ec8e58a4a4e@kernel.org> @ 2026-05-04 21:22 ` Marco Elver 2026-05-06 13:03 ` Marco Elver 0 siblings, 1 reply; 12+ messages in thread From: Marco Elver @ 2026-05-04 21:22 UTC (permalink / raw) To: Vlastimil Babka (SUSE) Cc: Andrew Morton, Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Harry Yoo, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, Andrey Konovalov, Florent Revest, Jann Horn, KP Singh, Matteo Rizzo, GONG Ruiqi On Thu, Apr 30, 2026 at 03:03PM +0200, Vlastimil Babka (SUSE) wrote: > On 4/24/26 15:24, Marco Elver wrote: > > > @@ -938,7 +968,7 @@ void *__kmalloc_large_node_noprof(size_t size, gfp_t flags, int node) > > * Try really hard to succeed the allocation but fail > > * eventually. > > */ > > -static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t flags) > > +static __always_inline __alloc_size(1) void *_kmalloc_noprof(size_t size, gfp_t flags, kmalloc_token_t token) > > { > > if (__builtin_constant_p(size) && size) { > > unsigned int index; > > @@ -948,14 +978,16 @@ static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t f > > > > index = kmalloc_index(size); > > return __kmalloc_cache_noprof( > > - kmalloc_caches[kmalloc_type(flags, _RET_IP_)][index], > > + kmalloc_caches[kmalloc_type(flags, token)][index], > > While reviewing this, it occured to me we might have been using _RET_IP_ > here in a suboptimal way ever since this was introduced. 
Since this is all > inlined, shouldn't have we been using _THIS_IP_ to really randomize using > the kmalloc() callsite, and not its parent? > > And after this patch, we get the token passed to _kmalloc_noprof()... > > > flags, size); > > } > > - return __kmalloc_noprof(size, flags); > > + return __kmalloc_noprof(PASS_KMALLOC_PARAMS(size, NULL, token), flags); > > ... and used also here for the non-constant-size, where previously > __kmalloc_noprof() (not inline function) would correctly use _RET_IP_ on its > own ... > > > } > > +#define kmalloc_noprof(...) _kmalloc_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) > > ... and the token comes from here. With random partitioning that's > #define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) > > so that AFAIK makes the situation worse as now the cases without constant > size also start randomizing by the parent callsite and not the kmalloc callsite. > > But there are many users of __kmalloc_token() and maybe some are corrent in > using _RET_IP_, I haven't checked, maybe we'll need two variants, or further > change things around. Good catch. I don't think we need multiple variants (otherwise the TYPED variant would be broken) - we're moving token generation to the callers (not even inlined anymore) with all this macro magic. I think this is all we need: --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -503,7 +503,7 @@ int kmem_cache_shrink(struct kmem_cache *s); typedef struct { unsigned long v; } kmalloc_token_t; #ifdef CONFIG_KMALLOC_PARTITION_RANDOM extern unsigned long random_kmalloc_seed; -#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) +#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _THIS_IP_ }) #elif defined(CONFIG_KMALLOC_PARTITION_TYPED) #define __kmalloc_token(...) ((kmalloc_token_t){ .v = __builtin_infer_alloc_token(__VA_ARGS__) }) #endif Plus a paragraph in the commit message. Let me add that. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning 2026-05-04 21:22 ` [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning Marco Elver @ 2026-05-06 13:03 ` Marco Elver 2026-05-07 9:38 ` Nathan Chancellor 2026-05-07 21:49 ` Harry Yoo (Oracle) 0 siblings, 2 replies; 12+ messages in thread From: Marco Elver @ 2026-05-06 13:03 UTC (permalink / raw) To: Vlastimil Babka (SUSE) Cc: Andrew Morton, Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Harry Yoo, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, Andrey Konovalov, Florent Revest, Jann Horn, KP Singh, Matteo Rizzo, GONG Ruiqi On Mon, 4 May 2026 at 23:23, Marco Elver <elver@google.com> wrote: > > On Thu, Apr 30, 2026 at 03:03PM +0200, Vlastimil Babka (SUSE) wrote: > > On 4/24/26 15:24, Marco Elver wrote: > > > > > @@ -938,7 +968,7 @@ void *__kmalloc_large_node_noprof(size_t size, gfp_t flags, int node) > > > * Try really hard to succeed the allocation but fail > > > * eventually. 
> > > */ > > > -static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t flags) > > > +static __always_inline __alloc_size(1) void *_kmalloc_noprof(size_t size, gfp_t flags, kmalloc_token_t token) > > > { > > > if (__builtin_constant_p(size) && size) { > > > unsigned int index; > > > @@ -948,14 +978,16 @@ static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t f > > > > > > index = kmalloc_index(size); > > > return __kmalloc_cache_noprof( > > > - kmalloc_caches[kmalloc_type(flags, _RET_IP_)][index], > > > + kmalloc_caches[kmalloc_type(flags, token)][index], > > > > While reviewing this, it occured to me we might have been using _RET_IP_ > > here in a suboptimal way ever since this was introduced. Since this is all > > inlined, shouldn't have we been using _THIS_IP_ to really randomize using > > the kmalloc() callsite, and not its parent? > > > > And after this patch, we get the token passed to _kmalloc_noprof()... > > > > > flags, size); > > > } > > > - return __kmalloc_noprof(size, flags); > > > + return __kmalloc_noprof(PASS_KMALLOC_PARAMS(size, NULL, token), flags); > > > > ... and used also here for the non-constant-size, where previously > > __kmalloc_noprof() (not inline function) would correctly use _RET_IP_ on its > > own ... > > > > > } > > > +#define kmalloc_noprof(...) _kmalloc_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) > > > > ... and the token comes from here. With random partitioning that's > > #define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) > > > > so that AFAIK makes the situation worse as now the cases without constant > > size also start randomizing by the parent callsite and not the kmalloc callsite. > > > > But there are many users of __kmalloc_token() and maybe some are corrent in > > using _RET_IP_, I haven't checked, maybe we'll need two variants, or further > > change things around. > > Good catch. 
I don't think we need multiple variants (otherwise the TYPED > variant would be broken) - we're moving token generation to the callers > (not even inlined anymore) with all this macro magic. > > I think this is all we need: > > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -503,7 +503,7 @@ int kmem_cache_shrink(struct kmem_cache *s); > typedef struct { unsigned long v; } kmalloc_token_t; > #ifdef CONFIG_KMALLOC_PARTITION_RANDOM > extern unsigned long random_kmalloc_seed; > -#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) > +#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _THIS_IP_ }) > #elif defined(CONFIG_KMALLOC_PARTITION_TYPED) > #define __kmalloc_token(...) ((kmalloc_token_t){ .v = __builtin_infer_alloc_token(__VA_ARGS__) }) > #endif > > Plus a paragraph in the commit message. Let me add that. Bah, this is why it doesn't work: >> drivers/gpu/drm/msm/msm_gpu.c:272:4: error: cannot jump from this indirect goto statement to one of its possible targets 272 | drm_exec_retry_on_contention(&exec); | ^ include/drm/drm_exec.h:123:4: note: expanded from macro 'drm_exec_retry_on_contention' 123 | goto *__drm_exec_retry_ptr; \ | ^ drivers/gpu/drm/msm/msm_gpu.c:304:16: note: possible target of indirect goto statement 304 | state->bos = kcalloc(submit->nr_bos, | ^ include/linux/slab.h:1173:34: note: expanded from macro 'kcalloc' 1173 | #define kcalloc(n, size, flags) kmalloc_array(n, size, (flags) | __GFP_ZERO) | ^ include/linux/slab.h:1133:42: note: expanded from macro 'kmalloc_array' 1133 | #define kmalloc_array(...) alloc_hooks(kmalloc_array_noprof(__VA_ARGS__)) | ^ include/linux/slab.h:1132:71: note: expanded from macro 'kmalloc_array_noprof' 1132 | #define kmalloc_array_noprof(...) _kmalloc_array_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) | ^ include/linux/slab.h:506:55: note: expanded from macro '__kmalloc_token' 506 | #define __kmalloc_token(...) 
((kmalloc_token_t){ .v = _THIS_IP_ }) | ^ include/linux/instruction_pointer.h:10:41: note: expanded from macro '_THIS_IP_' 10 | #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; }) | ^ drivers/gpu/drm/msm/msm_gpu.c:304:16: note: jump enters a statement expression Apparently using _THIS_IP_ creates a possible indirect jump target, but because it's in a statement expression, it's invalid, so the compiler complains. This is obviously nonsense, because the actual indirect jump in this gpu driver code would never jump to the _THIS_IP_ __here label, but that's what it is. Given this pre-existing issue, we probably need to continue using _RET_IP_, as before. I tried to fix _THIS_IP_, but it's incredibly brittle (e.g. __always_inline function returning address of label doesn't work on Clang, but would on GCC). ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning 2026-05-06 13:03 ` Marco Elver @ 2026-05-07 9:38 ` Nathan Chancellor 2026-05-07 21:49 ` Harry Yoo (Oracle) 1 sibling, 0 replies; 12+ messages in thread From: Nathan Chancellor @ 2026-05-07 9:38 UTC (permalink / raw) To: Marco Elver Cc: Vlastimil Babka (SUSE), Andrew Morton, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Harry Yoo, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, Andrey Konovalov, Florent Revest, Jann Horn, KP Singh, Matteo Rizzo, GONG Ruiqi On Wed, May 06, 2026 at 03:03:27PM +0200, Marco Elver wrote: > Bah, this is why it doesn't work: > > >> drivers/gpu/drm/msm/msm_gpu.c:272:4: error: cannot jump from this indirect goto statement to one of its possible targets > 272 | drm_exec_retry_on_contention(&exec); > | ^ > include/drm/drm_exec.h:123:4: note: expanded from macro > 'drm_exec_retry_on_contention' > 123 | goto *__drm_exec_retry_ptr; \ > | ^ > drivers/gpu/drm/msm/msm_gpu.c:304:16: note: possible target of > indirect goto statement > 304 | state->bos = kcalloc(submit->nr_bos, > | ^ > include/linux/slab.h:1173:34: note: expanded from macro 'kcalloc' > 1173 | #define kcalloc(n, size, flags) kmalloc_array(n, > size, (flags) | __GFP_ZERO) > | ^ > include/linux/slab.h:1133:42: note: expanded from macro 'kmalloc_array' > 1133 | #define kmalloc_array(...) > alloc_hooks(kmalloc_array_noprof(__VA_ARGS__)) > | ^ > include/linux/slab.h:1132:71: note: expanded from macro > 'kmalloc_array_noprof' > 1132 | #define kmalloc_array_noprof(...) 
> _kmalloc_array_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) > | > ^ > include/linux/slab.h:506:55: note: expanded from macro '__kmalloc_token' > 506 | #define __kmalloc_token(...) ((kmalloc_token_t){ .v = _THIS_IP_ }) > | ^ > include/linux/instruction_pointer.h:10:41: note: expanded from > macro '_THIS_IP_' > 10 | #define _THIS_IP_ ({ __label__ __here; __here: (unsigned > long)&&__here; }) > | ^ > drivers/gpu/drm/msm/msm_gpu.c:304:16: note: jump enters a statement > expression > > > Apparently using _THIS_IP_ creates a possible indirect jump target, > but because it's in a statement expression, it's invalid, so the > compiler complains. This is obviously nonsense, because the actual > indirect jump in this gpu driver code would never jump to the > _THIS_IP_ __here label, but that's what it is. > > Given this pre-existing issue, we probably need to continue using > _RET_IP_, as before. I tried to fix _THIS_IP_, but it's incredibly > brittle (e.g. __always_inline function returning address of label > doesn't work on Clang, but would on GCC). For what it's worth, both LLVM and GCC consider the generic version of _THIS_IP_ to be broken (even if it works currently), as the address of a label is only expected to be used with a computed goto (hence the clang error above, it just looks for possible computed goto targets): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44298 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071 https://github.com/llvm/llvm-project/issues/138272 -- Cheers, Nathan ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning 2026-05-06 13:03 ` Marco Elver 2026-05-07 9:38 ` Nathan Chancellor @ 2026-05-07 21:49 ` Harry Yoo (Oracle) 2026-05-08 14:21 ` Marco Elver 1 sibling, 1 reply; 12+ messages in thread From: Harry Yoo (Oracle) @ 2026-05-07 21:49 UTC (permalink / raw) To: Marco Elver Cc: Vlastimil Babka (SUSE), Andrew Morton, Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, Andrey Konovalov, Florent Revest, Jann Horn, KP Singh, Matteo Rizzo, GONG Ruiqi On Wed, May 06, 2026 at 03:03:27PM +0100, Marco Elver wrote: > On Mon, 4 May 2026 at 23:23, Marco Elver <elver@google.com> wrote: > > On Thu, Apr 30, 2026 at 03:03PM +0200, Vlastimil Babka (SUSE) wrote: > > > On 4/24/26 15:24, Marco Elver wrote: > > > > @@ -948,14 +978,16 @@ static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t f > > > > > > > > index = kmalloc_index(size); > > > > return __kmalloc_cache_noprof( > > > > - kmalloc_caches[kmalloc_type(flags, _RET_IP_)][index], > > > > + kmalloc_caches[kmalloc_type(flags, token)][index], > > > > > > While reviewing this, it occured to me we might have been using _RET_IP_ > > > here in a suboptimal way ever since this was introduced. Since this is all > > > inlined, shouldn't have we been using _THIS_IP_ to really randomize using > > > the kmalloc() callsite, and not its parent? > > > > > > And after this patch, we get the token passed to _kmalloc_noprof()... 
> > > > > > > flags, size); > > > > } > > > > - return __kmalloc_noprof(size, flags); > > > > + return __kmalloc_noprof(PASS_KMALLOC_PARAMS(size, NULL, token), flags); > > > > > > ... and used also here for the non-constant-size, where previously > > > __kmalloc_noprof() (not inline function) would correctly use _RET_IP_ on its > > > own ... > > > > > > > } > > > > +#define kmalloc_noprof(...) _kmalloc_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) > > > > > > ... and the token comes from here. With random partitioning that's > > > #define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) > > > > > > so that AFAIK makes the situation worse as now the cases without constant > > > size also start randomizing by the parent callsite and not the kmalloc callsite. > > > > > > But there are many users of __kmalloc_token() and maybe some are corrent in > > > using _RET_IP_, I haven't checked, maybe we'll need two variants, or further > > > change things around. > > > > Good catch. I don't think we need multiple variants (otherwise the TYPED > > variant would be broken) - we're moving token generation to the callers > > (not even inlined anymore) with all this macro magic. > > > > I think this is all we need: > > > > --- a/include/linux/slab.h > > +++ b/include/linux/slab.h > > @@ -503,7 +503,7 @@ int kmem_cache_shrink(struct kmem_cache *s); > > typedef struct { unsigned long v; } kmalloc_token_t; > > #ifdef CONFIG_KMALLOC_PARTITION_RANDOM > > extern unsigned long random_kmalloc_seed; > > -#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) > > +#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _THIS_IP_ }) > > #elif defined(CONFIG_KMALLOC_PARTITION_TYPED) > > #define __kmalloc_token(...) ((kmalloc_token_t){ .v = __builtin_infer_alloc_token(__VA_ARGS__) }) > > #endif > > > > Plus a paragraph in the commit message. Let me add that. Err, I was like "yes, this is the way to go!" and then... 
> Bah, this is why it doesn't work: > > >> drivers/gpu/drm/msm/msm_gpu.c:272:4: error: cannot jump from this indirect goto statement to one of its possible targets > 272 | drm_exec_retry_on_contention(&exec); > | ^ > include/drm/drm_exec.h:123:4: note: expanded from macro > 'drm_exec_retry_on_contention' > 123 | goto *__drm_exec_retry_ptr; \ > | ^ > drivers/gpu/drm/msm/msm_gpu.c:304:16: note: possible target of > indirect goto statement > 304 | state->bos = kcalloc(submit->nr_bos, > | ^ > include/linux/slab.h:1173:34: note: expanded from macro 'kcalloc' > 1173 | #define kcalloc(n, size, flags) kmalloc_array(n, > size, (flags) | __GFP_ZERO) > | ^ > include/linux/slab.h:1133:42: note: expanded from macro 'kmalloc_array' > 1133 | #define kmalloc_array(...) > alloc_hooks(kmalloc_array_noprof(__VA_ARGS__)) > | ^ > include/linux/slab.h:1132:71: note: expanded from macro > 'kmalloc_array_noprof' > 1132 | #define kmalloc_array_noprof(...) > _kmalloc_array_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) > | > ^ > include/linux/slab.h:506:55: note: expanded from macro '__kmalloc_token' > 506 | #define __kmalloc_token(...) ((kmalloc_token_t){ .v = _THIS_IP_ }) > | ^ > include/linux/instruction_pointer.h:10:41: note: expanded from > macro '_THIS_IP_' > 10 | #define _THIS_IP_ ({ __label__ __here; __here: (unsigned > long)&&__here; }) > | ^ > drivers/gpu/drm/msm/msm_gpu.c:304:16: note: jump enters a statement > expression > > Apparently using _THIS_IP_ creates a possible indirect jump target, Didn't even realize people use indirect gotos, heh :) > but because it's in a statement expression, it's invalid, so the > compiler complains. This is obviously nonsense, because the actual > indirect jump in this gpu driver code would never jump to the > _THIS_IP_ __here label, but that's what it is. Yeah, I guess it's quite tricky to handle when you don't know where it'd jump to as it's an indirect one, and there's an invalid jump label... 
> Given this pre-existing issue, we probably need to continue using > _RET_IP_, as before. Agreed! -- Cheers, Harry / Hyeonggon ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning 2026-05-07 21:49 ` Harry Yoo (Oracle) @ 2026-05-08 14:21 ` Marco Elver 2026-05-11 8:31 ` Harry Yoo (Oracle) 0 siblings, 1 reply; 12+ messages in thread From: Marco Elver @ 2026-05-08 14:21 UTC (permalink / raw) To: Harry Yoo (Oracle) Cc: Vlastimil Babka (SUSE), Andrew Morton, Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Hao Li, David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers, Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild, linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm, Andrey Konovalov, Florent Revest, Jann Horn, KP Singh, Matteo Rizzo, GONG Ruiqi On Thu, May 07, 2026 at 11:49PM +0200, 'Harry Yoo (Oracle)' via kasan-dev wrote: > On Wed, May 06, 2026 at 03:03:27PM +0100, Marco Elver wrote: > > On Mon, 4 May 2026 at 23:23, Marco Elver <elver@google.com> wrote: > > > On Thu, Apr 30, 2026 at 03:03PM +0200, Vlastimil Babka (SUSE) wrote: > > > > On 4/24/26 15:24, Marco Elver wrote: > > > > > @@ -948,14 +978,16 @@ static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t f > > > > > > > > > > index = kmalloc_index(size); > > > > > return __kmalloc_cache_noprof( > > > > > - kmalloc_caches[kmalloc_type(flags, _RET_IP_)][index], > > > > > + kmalloc_caches[kmalloc_type(flags, token)][index], > > > > > > > > While reviewing this, it occured to me we might have been using _RET_IP_ > > > > here in a suboptimal way ever since this was introduced. Since this is all > > > > inlined, shouldn't have we been using _THIS_IP_ to really randomize using > > > > the kmalloc() callsite, and not its parent? > > > > > > > > And after this patch, we get the token passed to _kmalloc_noprof()... 
> > > > > > > > > flags, size); > > > > > } > > > > > - return __kmalloc_noprof(size, flags); > > > > > + return __kmalloc_noprof(PASS_KMALLOC_PARAMS(size, NULL, token), flags); > > > > > > > > ... and used also here for the non-constant-size, where previously > > > > __kmalloc_noprof() (not inline function) would correctly use _RET_IP_ on its > > > > own ... > > > > > > > > > } > > > > > +#define kmalloc_noprof(...) _kmalloc_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) > > > > > > > > ... and the token comes from here. With random partitioning that's > > > > #define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) > > > > > > > > so that AFAIK makes the situation worse as now the cases without constant > > > > size also start randomizing by the parent callsite and not the kmalloc callsite. > > > > > > > > But there are many users of __kmalloc_token() and maybe some are corrent in > > > > using _RET_IP_, I haven't checked, maybe we'll need two variants, or further > > > > change things around. > > > > > > Good catch. I don't think we need multiple variants (otherwise the TYPED > > > variant would be broken) - we're moving token generation to the callers > > > (not even inlined anymore) with all this macro magic. > > > > > > I think this is all we need: > > > > > > --- a/include/linux/slab.h > > > +++ b/include/linux/slab.h > > > @@ -503,7 +503,7 @@ int kmem_cache_shrink(struct kmem_cache *s); > > > typedef struct { unsigned long v; } kmalloc_token_t; > > > #ifdef CONFIG_KMALLOC_PARTITION_RANDOM > > > extern unsigned long random_kmalloc_seed; > > > -#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) > > > +#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _THIS_IP_ }) > > > #elif defined(CONFIG_KMALLOC_PARTITION_TYPED) > > > #define __kmalloc_token(...) ((kmalloc_token_t){ .v = __builtin_infer_alloc_token(__VA_ARGS__) }) > > > #endif > > > > > > Plus a paragraph in the commit message. Let me add that. 
> > Err, I was like "yes, this is the way to go!" > > and then... > > > Bah, this is why it doesn't work: > > > > >> drivers/gpu/drm/msm/msm_gpu.c:272:4: error: cannot jump from this indirect goto statement to one of its possible targets > > 272 | drm_exec_retry_on_contention(&exec); > > | ^ > > include/drm/drm_exec.h:123:4: note: expanded from macro > > 'drm_exec_retry_on_contention' > > 123 | goto *__drm_exec_retry_ptr; \ > > | ^ > > drivers/gpu/drm/msm/msm_gpu.c:304:16: note: possible target of > > indirect goto statement > > 304 | state->bos = kcalloc(submit->nr_bos, > > | ^ > > include/linux/slab.h:1173:34: note: expanded from macro 'kcalloc' > > 1173 | #define kcalloc(n, size, flags) kmalloc_array(n, > > size, (flags) | __GFP_ZERO) > > | ^ > > include/linux/slab.h:1133:42: note: expanded from macro 'kmalloc_array' > > 1133 | #define kmalloc_array(...) > > alloc_hooks(kmalloc_array_noprof(__VA_ARGS__)) > > | ^ > > include/linux/slab.h:1132:71: note: expanded from macro > > 'kmalloc_array_noprof' > > 1132 | #define kmalloc_array_noprof(...) > > _kmalloc_array_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__)) > > | > > ^ > > include/linux/slab.h:506:55: note: expanded from macro '__kmalloc_token' > > 506 | #define __kmalloc_token(...) ((kmalloc_token_t){ .v = _THIS_IP_ }) > > | ^ > > include/linux/instruction_pointer.h:10:41: note: expanded from > > macro '_THIS_IP_' > > 10 | #define _THIS_IP_ ({ __label__ __here; __here: (unsigned > > long)&&__here; }) > > | ^ > > drivers/gpu/drm/msm/msm_gpu.c:304:16: note: jump enters a statement > > expression > > > > Apparently using _THIS_IP_ creates a possible indirect jump target, > > Didn't even realize people use indirect gotos, heh :) > > > but because it's in a statement expression, it's invalid, so the > > compiler complains. This is obviously nonsense, because the actual > > indirect jump in this gpu driver code would never jump to the > > _THIS_IP_ __here label, but that's what it is. 
> > Yeah, I guess it's quite tricky to handle when you don't know where > it'd jump to as it's an indirect one, and there's an invalid jump > label... > > > Given this pre-existing issue, we probably need to continue using > > _RET_IP_, as before. I think I have a solution for this mess, see below. I would not send it as 1 series, but only include the slab changes (+ instruction_pointer.h change to introduce _CODE_LOCATION_) as one series, to go through the slab tree. The rest of the patches would go to respective arch maintainers. ------ >8 ------ diff --git a/arch/alpha/include/asm/linkage.h b/arch/alpha/include/asm/linkage.h index aa8661fa60dc..88617cfaa0f7 100644 --- a/arch/alpha/include/asm/linkage.h +++ b/arch/alpha/include/asm/linkage.h @@ -6,4 +6,6 @@ #define SYSCALL_ALIAS(alias, name) \ asm ( #alias " = " #name "\n\t.globl " #alias) +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("br %0, 1f\n1:" : "=r" (__ip)); __ip; }) + #endif diff --git a/arch/arc/include/asm/linkage.h b/arch/arc/include/asm/linkage.h index ba3cb65b5eaa..3fb91d1672ba 100644 --- a/arch/arc/include/asm/linkage.h +++ b/arch/arc/include/asm/linkage.h @@ -75,6 +75,8 @@ #define __arcfp_data __section(".data") #endif +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("mov %0, pcl" : "=r" (__ip)); __ip; }) + #endif /* __ASSEMBLER__ */ #endif diff --git a/arch/arm/include/asm/linkage.h b/arch/arm/include/asm/linkage.h index c4670694ada7..416e6a242dc4 100644 --- a/arch/arm/include/asm/linkage.h +++ b/arch/arm/include/asm/linkage.h @@ -9,4 +9,6 @@ .type name, %function; \ END(name) +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("1: adr %0, 1b" : "=r" (__ip)); __ip; }) + #endif diff --git a/arch/arm64/include/asm/linkage.h b/arch/arm64/include/asm/linkage.h index 40bd17add539..73eabc82a6bb 100644 --- a/arch/arm64/include/asm/linkage.h +++ b/arch/arm64/include/asm/linkage.h @@ -43,4 +43,6 @@ SYM_TYPED_START(name, SYM_L_GLOBAL, SYM_A_ALIGN) \ bti c ; +#define _THIS_IP_ ({ unsigned 
long __ip; asm volatile("adr %0, ." : "=r" (__ip)); __ip; }) + #endif diff --git a/arch/csky/include/asm/linkage.h b/arch/csky/include/asm/linkage.h new file mode 100644 index 000000000000..04afd3583e25 --- /dev/null +++ b/arch/csky/include/asm/linkage.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_CSKY_LINKAGE_H +#define __ASM_CSKY_LINKAGE_H + +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("grs %0, ." : "=r" (__ip)); __ip; }) + +#endif /* __ASM_CSKY_LINKAGE_H */ diff --git a/arch/hexagon/include/asm/linkage.h b/arch/hexagon/include/asm/linkage.h index ebdb581939e8..b3808f093e62 100644 --- a/arch/hexagon/include/asm/linkage.h +++ b/arch/hexagon/include/asm/linkage.h @@ -9,4 +9,6 @@ #define __ALIGN .align 4 #define __ALIGN_STR ".align 4" +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("call 1f\n1: %0 = r31" : "=r" (__ip) : : "r31"); __ip; }) + #endif diff --git a/arch/loongarch/include/asm/linkage.h b/arch/loongarch/include/asm/linkage.h index a1bd6a3ee03a..f175b25068d7 100644 --- a/arch/loongarch/include/asm/linkage.h +++ b/arch/loongarch/include/asm/linkage.h @@ -77,4 +77,6 @@ #define SYM_SIGFUNC_END(name) SYM_FUNC_END(name) +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("pcaddi %0, 0" : "=r" (__ip)); __ip; }) + #endif diff --git a/arch/m68k/include/asm/linkage.h b/arch/m68k/include/asm/linkage.h index c8b84282764c..9ed2f36830d0 100644 --- a/arch/m68k/include/asm/linkage.h +++ b/arch/m68k/include/asm/linkage.h @@ -35,4 +35,6 @@ __asmlinkage_protect_n(ret, "m" (arg1), "m" (arg2), "m" (arg3), \ "m" (arg4), "m" (arg5), "m" (arg6)) +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("lea %%pc@(.), %0" : "=a" (__ip)); __ip; }) + #endif diff --git a/arch/microblaze/include/asm/linkage.h b/arch/microblaze/include/asm/linkage.h new file mode 100644 index 000000000000..fc3873e0e9b6 --- /dev/null +++ b/arch/microblaze/include/asm/linkage.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef 
_ASM_MICROBLAZE_LINKAGE_H +#define _ASM_MICROBLAZE_LINKAGE_H + +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("mfs %0, rpc" : "=r" (__ip)); __ip; }) + +#endif /* _ASM_MICROBLAZE_LINKAGE_H */ diff --git a/arch/mips/include/asm/linkage.h b/arch/mips/include/asm/linkage.h index fd44ba754f1a..0579eac57def 100644 --- a/arch/mips/include/asm/linkage.h +++ b/arch/mips/include/asm/linkage.h @@ -10,4 +10,14 @@ #define SYSCALL_ALIAS(alias, name) \ asm ( #alias " = " #name "\n\t.globl " #alias) +#define _THIS_IP_ ({ \ + unsigned long __ip; \ + asm volatile("bal 1f\n\t" \ + " nop\n\t" \ + "1: move %0, $ra" \ + : "=r" (__ip) : : "$ra" \ + ); \ + __ip; \ +}) + #endif diff --git a/arch/nios2/include/asm/linkage.h b/arch/nios2/include/asm/linkage.h index 211302301a8a..c4073235852b 100644 --- a/arch/nios2/include/asm/linkage.h +++ b/arch/nios2/include/asm/linkage.h @@ -12,4 +12,6 @@ #define __ALIGN .align 4 #define __ALIGN_STR ".align 4" +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("nextpc %0" : "=r" (__ip)); __ip; }) + #endif diff --git a/arch/openrisc/include/asm/linkage.h b/arch/openrisc/include/asm/linkage.h index 25aa449ac30e..a96e808b5d1a 100644 --- a/arch/openrisc/include/asm/linkage.h +++ b/arch/openrisc/include/asm/linkage.h @@ -18,4 +18,14 @@ #define __ALIGN .align 0 #define __ALIGN_STR ".align 0" +#define _THIS_IP_ ({ \ + unsigned long __ip; \ + asm volatile("l.jal 1f\n\t" \ + " l.nop\n\t" \ + "1: l.ori %0, r9, 0" \ + : "=r" (__ip) : : "r9" \ + ); \ + __ip; \ +}) + #endif /* __ASM_OPENRISC_LINKAGE_H */ diff --git a/arch/parisc/include/asm/linkage.h b/arch/parisc/include/asm/linkage.h index d4cad492b971..d4d8ff7735c7 100644 --- a/arch/parisc/include/asm/linkage.h +++ b/arch/parisc/include/asm/linkage.h @@ -37,4 +37,12 @@ name: ASM_NL\ #endif /* __ASSEMBLER__ */ +#define _THIS_IP_ ({ \ + unsigned long __ip; \ + asm volatile("b,l 1f, %0\n\t" \ + " nop\n\t" \ + "1:" : "=r" (__ip)); \ + __ip; \ +}) + #endif /* __ASM_PARISC_LINKAGE_H */ diff --git 
a/arch/powerpc/include/asm/linkage.h b/arch/powerpc/include/asm/linkage.h index b71b9582e754..aa469e7bef0b 100644 --- a/arch/powerpc/include/asm/linkage.h +++ b/arch/powerpc/include/asm/linkage.h @@ -13,4 +13,13 @@ "\t.globl ." #alias "\n\t.set ." #alias ", ." #name) #endif +#define _THIS_IP_ ({ \ + unsigned long __ip; \ + asm volatile("bcl 20,31,1f\n\t" \ + "1: mflr %0" \ + : "=r" (__ip) : : "lr" \ + ); \ + __ip; \ +}) + #endif /* _ASM_POWERPC_LINKAGE_H */ diff --git a/arch/riscv/include/asm/linkage.h b/arch/riscv/include/asm/linkage.h index 9e88ba23cd2b..7e0210ef4eb4 100644 --- a/arch/riscv/include/asm/linkage.h +++ b/arch/riscv/include/asm/linkage.h @@ -9,4 +9,6 @@ #define __ALIGN .balign 4 #define __ALIGN_STR ".balign 4" +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("auipc %0, 0" : "=r" (__ip)); __ip; }) + #endif /* _ASM_RISCV_LINKAGE_H */ diff --git a/arch/s390/include/asm/linkage.h b/arch/s390/include/asm/linkage.h index df3fb7d8227b..1b3ac553a642 100644 --- a/arch/s390/include/asm/linkage.h +++ b/arch/s390/include/asm/linkage.h @@ -7,4 +7,6 @@ #define __ALIGN .balign CONFIG_FUNCTION_ALIGNMENT, 0x07 #define __ALIGN_STR __stringify(__ALIGN) +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("larl %0, ." 
: "=d" (__ip)); __ip; }) + #endif diff --git a/arch/sh/include/asm/linkage.h b/arch/sh/include/asm/linkage.h index 7c2fa27a43f8..af56b38b6001 100644 --- a/arch/sh/include/asm/linkage.h +++ b/arch/sh/include/asm/linkage.h @@ -5,4 +5,6 @@ #define __ALIGN .balign 4 #define __ALIGN_STR ".balign 4" +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("mova 1f, %0\n1:" : "=z" (__ip)); __ip; }) + #endif diff --git a/arch/sparc/include/asm/linkage.h b/arch/sparc/include/asm/linkage.h new file mode 100644 index 000000000000..3f24e2da88be --- /dev/null +++ b/arch/sparc/include/asm/linkage.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_SPARC_LINKAGE_H +#define _ASM_SPARC_LINKAGE_H + +#ifdef CONFIG_SPARC64 +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("rd %%pc, %0" : "=r" (__ip)); __ip; }) +#else +#define _THIS_IP_ ({ \ + unsigned long __ip; \ + asm volatile("call 1f\n\t" \ + " nop\n\t" \ + "1: mov %%o7, %0" \ + : "=r" (__ip) : : "o7" \ + ); \ + __ip; \ +}) +#endif + +#endif /* _ASM_SPARC_LINKAGE_H */ diff --git a/arch/x86/include/asm/linkage.h b/arch/x86/include/asm/linkage.h index a7294656ad90..bce3c6f4b94f 100644 --- a/arch/x86/include/asm/linkage.h +++ b/arch/x86/include/asm/linkage.h @@ -13,11 +13,12 @@ * The generic version tends to create spurious ENDBR instructions under * certain conditions. 
*/ -#define _THIS_IP_ ({ unsigned long __here; asm ("lea 0(%%rip), %0" : "=r" (__here)); __here; }) +#define _THIS_IP_ ({ unsigned long __here; asm volatile("lea 0(%%rip), %0" : "=r" (__here)); __here; }) #endif #ifdef CONFIG_X86_32 #define asmlinkage CPP_ASMLINKAGE __attribute__((regparm(0))) +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("call 1f\n1: pop %0" : "=r" (__ip)); __ip; }) #endif /* CONFIG_X86_32 */ #define __ALIGN .balign CONFIG_FUNCTION_ALIGNMENT, 0x90; diff --git a/arch/xtensa/include/asm/linkage.h b/arch/xtensa/include/asm/linkage.h index 0ba9973235d9..9e6f5cc81964 100644 --- a/arch/xtensa/include/asm/linkage.h +++ b/arch/xtensa/include/asm/linkage.h @@ -6,4 +6,6 @@ #define __ALIGN .align 4 #define __ALIGN_STR ".align 4" +#define _THIS_IP_ ({ unsigned long __ip; asm volatile("call0 1f\n1: mov %0, a0" : "=r" (__ip) : : "a0"); __ip; }) + #endif diff --git a/include/linux/instruction_pointer.h b/include/linux/instruction_pointer.h index aa0b3ffea935..dfe73aafddb8 100644 --- a/include/linux/instruction_pointer.h +++ b/include/linux/instruction_pointer.h @@ -8,6 +8,30 @@ #ifndef _THIS_IP_ #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; }) +/* + * The current generic definition of _THIS_IP_ is considered broken by GCC [1] + * and Clang [2]. In particular, the address of a label is only expected to be + * used with a computed goto. + * + * [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071 + * [2] https://github.com/llvm/llvm-project/issues/138272 + * + * Mark it as broken, so that appropriate fallback options can be implemented + * for architectures that do not define their own _THIS_IP_. + */ +#define HAS_BROKEN_THIS_IP +#endif + +/* + * _CODE_LOCATION_ provides a unique identifier for the current code location.
+ * When _THIS_IP_ is broken (generic version), we fall back to a static marker + * which guarantees uniqueness and resolves to a constant address at link time, + * avoiding runtime overhead and compiler optimizations breaking it. + */ +#ifdef HAS_BROKEN_THIS_IP +#define _CODE_LOCATION_ ({ static const char __here; (unsigned long)&__here; }) +#else +#define _CODE_LOCATION_ _THIS_IP_ #endif #endif /* _LINUX_INSTRUCTION_POINTER_H */ diff --git a/include/linux/slab.h b/include/linux/slab.h index 5e1249e36b0d..a4bf1585411f 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -503,7 +503,7 @@ int kmem_cache_shrink(struct kmem_cache *s); typedef struct { unsigned long v; } kmalloc_token_t; #ifdef CONFIG_KMALLOC_PARTITION_RANDOM extern unsigned long random_kmalloc_seed; -#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ }) +#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _CODE_LOCATION_ }) #elif defined(CONFIG_KMALLOC_PARTITION_TYPED) #define __kmalloc_token(...) ((kmalloc_token_t){ .v = __builtin_infer_alloc_token(__VA_ARGS__) }) #endif ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning
  2026-05-08 14:21         ` Marco Elver
@ 2026-05-11  8:31           ` Harry Yoo (Oracle)
  2026-05-11  9:34             ` Marco Elver
  0 siblings, 1 reply; 12+ messages in thread
From: Harry Yoo (Oracle) @ 2026-05-11  8:31 UTC (permalink / raw)
  To: Marco Elver
  Cc: Vlastimil Babka (SUSE), Andrew Morton, Nathan Chancellor,
	Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Hao Li,
	David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers,
	Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild,
	linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm,
	Andrey Konovalov, Florent Revest, Jann Horn, KP Singh,
	Matteo Rizzo, GONG Ruiqi

On Fri, May 08, 2026 at 04:21:36PM +0200, Marco Elver wrote:
> I think I have a solution for this mess, see below.
>
> I would not send it as 1 series, but only include the slab changes (+
> instruction_pointer.h change to introduce _CODE_LOCATION_) as one
> series, to go through the slab tree. The rest of the patches would go to
> respective arch maintainers.

I'm assuming this will be a follow-up and reviewing patch 1
(and waiting for Jon's thoughts on patch 2)

> diff --git a/include/linux/instruction_pointer.h b/include/linux/instruction_pointer.h
> index aa0b3ffea935..dfe73aafddb8 100644
> --- a/include/linux/instruction_pointer.h
> +++ b/include/linux/instruction_pointer.h
> @@ -8,6 +8,30 @@
>
>  #ifndef _THIS_IP_
>  #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
> +/*
> + * The current generic definition of _THIS_IP_ is considered broken by GCC [1]
> + * and Clang [2]. In particular, the address of a label is only expected to be
> + * used with a computed goto.
> + *
> + * [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071
> + * [2] https://github.com/llvm/llvm-project/issues/138272
> + *
> + * Mark it as broken, so that appropriate fallback options can be implemented
> + * for architectures that do not define their won _THIS_IP_.
> + */
> +#define HAS_BROKEN_THIS_IP
> +#endif

As long as _THIS_IP_ is broken on some arches, it cannot be used anyway
when in a general API that can be used by arbitrary users?

Is it something that can be fixed in all arches over time?

> +/*
> + * _CODE_LOCATION_ provides a unique identifier for the current code location.
> + * When _THIS_IP_ is broken (generic version), we fall back to a static marker
> + * which guarantees uniqueness and resolves to a constant address at link time,
> + * avoiding runtime overhead and compiler optimizations breaking it.
> + */
> +#ifdef HAS_BROKEN_THIS_IP
> +#define _CODE_LOCATION_ ({ static const char __here; (unsigned long)&__here; })

Nice!

Yes, we don't really need the exact code location
for partitioning kmalloc caches.

IIRC lockdep does a similar thing to define lock classes (unique for
each lock init location)

> +#else
> +#define _CODE_LOCATION_ _THIS_IP_
>  #endif

Probably we don't need this fallback?

>  #endif /* _LINUX_INSTRUCTION_POINTER_H */
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 5e1249e36b0d..a4bf1585411f 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -503,7 +503,7 @@ int kmem_cache_shrink(struct kmem_cache *s);
>  typedef struct { unsigned long v; } kmalloc_token_t;
>  #ifdef CONFIG_KMALLOC_PARTITION_RANDOM
>  extern unsigned long random_kmalloc_seed;
> -#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ })
> +#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _CODE_LOCATION_ })
>  #elif defined(CONFIG_KMALLOC_PARTITION_TYPED)
>  #define __kmalloc_token(...) ((kmalloc_token_t){ .v = __builtin_infer_alloc_token(__VA_ARGS__) })
>  #endif

-- 
Cheers,
Harry / Hyeonggon

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning
  2026-05-11  8:31           ` Harry Yoo (Oracle)
@ 2026-05-11  9:34             ` Marco Elver
  2026-05-11 18:14               ` Kees Cook
  0 siblings, 1 reply; 12+ messages in thread
From: Marco Elver @ 2026-05-11  9:34 UTC (permalink / raw)
  To: Harry Yoo (Oracle)
  Cc: Vlastimil Babka (SUSE), Andrew Morton, Nathan Chancellor,
	Nicolas Schier, Dennis Zhou, Tejun Heo, Christoph Lameter, Hao Li,
	David Rientjes, Roman Gushchin, Kees Cook, Gustavo A. R. Silva,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers,
	Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild,
	linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm,
	Andrey Konovalov, Florent Revest, Jann Horn, KP Singh,
	Matteo Rizzo, GONG Ruiqi

On Mon, 11 May 2026 at 10:31, 'Harry Yoo (Oracle)' via kasan-dev
<kasan-dev@googlegroups.com> wrote:
>
> On Fri, May 08, 2026 at 04:21:36PM +0200, Marco Elver wrote:
> > I think I have a solution for this mess, see below.
> >
> > I would not send it as 1 series, but only include the slab changes (+
> > instruction_pointer.h change to introduce _CODE_LOCATION_) as one
> > series, to go through the slab tree. The rest of the patches would go to
> > respective arch maintainers.
>
> I'm assuming this will be a follow-up and reviewing patch 1
> (and waiting for Jon's thoughts on patch 2)

I'll be sending v4 shortly.

> > diff --git a/include/linux/instruction_pointer.h b/include/linux/instruction_pointer.h
> > index aa0b3ffea935..dfe73aafddb8 100644
> > --- a/include/linux/instruction_pointer.h
> > +++ b/include/linux/instruction_pointer.h
> > @@ -8,6 +8,30 @@
> >
> >  #ifndef _THIS_IP_
> >  #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
> > +/*
> > + * The current generic definition of _THIS_IP_ is considered broken by GCC [1]
> > + * and Clang [2]. In particular, the address of a label is only expected to be
> > + * used with a computed goto.
> > + *
> > + * [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071
> > + * [2] https://github.com/llvm/llvm-project/issues/138272
> > + *
> > + * Mark it as broken, so that appropriate fallback options can be implemented
> > + * for architectures that do not define their won _THIS_IP_.
> > + */
> > +#define HAS_BROKEN_THIS_IP
> > +#endif
>
> As long as _THIS_IP_ is broken on some arches, it cannot be used anyway
> when in a general API that can be used by arbitrary users?

It more or less works today, and for debugging or tracing it's "good
enough" in most cases.

The plan would be to phase out the generic _THIS_IP_ once all
architectures have a correct _THIS_IP_ implementation.

> Is it something that can be fixed in all arches over time?

Yes, I have patches for that.

> > +/*
> > + * _CODE_LOCATION_ provides a unique identifier for the current code location.
> > + * When _THIS_IP_ is broken (generic version), we fall back to a static marker
> > + * which guarantees uniqueness and resolves to a constant address at link time,
> > + * avoiding runtime overhead and compiler optimizations breaking it.
> > + */
> > +#ifdef HAS_BROKEN_THIS_IP
> > +#define _CODE_LOCATION_ ({ static const char __here; (unsigned long)&__here; })
>
> Nice!
>
> Yes, we don't really need the exact code location
> for partitioning kmalloc caches.
>
> IIRC lockdep does a similar thing to define lock classes (unique for
> each lock init location)
>
> > +#else
> > +#define _CODE_LOCATION_ _THIS_IP_
> >  #endif
>
> Probably we don't need this fallback?

x86-64 is the only arch that has working fast _THIS_IP_, and adding
static __here markers to bss is rather wasteful.

More architectures will be supporting _THIS_IP_ properly once I get to
send the patches. The mainstream architectures all have a reasonable
and fast way to get the current IP, so we don't need to waste bss
space there.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning
  2026-05-11  9:34             ` Marco Elver
@ 2026-05-11 18:14               ` Kees Cook
  0 siblings, 0 replies; 12+ messages in thread
From: Kees Cook @ 2026-05-11 18:14 UTC (permalink / raw)
  To: Marco Elver
  Cc: Harry Yoo (Oracle), Vlastimil Babka (SUSE), Andrew Morton,
	Nathan Chancellor, Nicolas Schier, Dennis Zhou, Tejun Heo,
	Christoph Lameter, Hao Li, David Rientjes, Roman Gushchin,
	Gustavo A. R. Silva, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Alexander Potapenko, Dmitry Vyukov, Nick Desaulniers,
	Bill Wendling, Justin Stitt, Miguel Ojeda, linux-kbuild,
	linux-kernel, linux-mm, linux-hardening, kasan-dev, llvm,
	Andrey Konovalov, Florent Revest, Jann Horn, KP Singh,
	Matteo Rizzo, GONG Ruiqi

On Mon, May 11, 2026 at 11:34:53AM +0200, Marco Elver wrote:
> On Mon, 11 May 2026 at 10:31, 'Harry Yoo (Oracle)' via kasan-dev
> <kasan-dev@googlegroups.com> wrote:
> >
> > On Fri, May 08, 2026 at 04:21:36PM +0200, Marco Elver wrote:
> > > I think I have a solution for this mess, see below.
> > >
> > > I would not send it as 1 series, but only include the slab changes (+
> > > instruction_pointer.h change to introduce _CODE_LOCATION_) as one
> > > series, to go through the slab tree. The rest of the patches would go to
> > > respective arch maintainers.
> >
> > I'm assuming this will be a follow-up and reviewing patch 1
> > (and waiting for Jon's thoughts on patch 2)
>
> I'll be sending v4 shortly.
>
> > > diff --git a/include/linux/instruction_pointer.h b/include/linux/instruction_pointer.h
> > > index aa0b3ffea935..dfe73aafddb8 100644
> > > --- a/include/linux/instruction_pointer.h
> > > +++ b/include/linux/instruction_pointer.h
> > > @@ -8,6 +8,30 @@
> > >
> > >  #ifndef _THIS_IP_
> > >  #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
> > > +/*
> > > + * The current generic definition of _THIS_IP_ is considered broken by GCC [1]
> > > + * and Clang [2]. In particular, the address of a label is only expected to be
> > > + * used with a computed goto.
> > > + *
> > > + * [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071
> > > + * [2] https://github.com/llvm/llvm-project/issues/138272
> > > + *
> > > + * Mark it as broken, so that appropriate fallback options can be implemented
> > > + * for architectures that do not define their won _THIS_IP_.

Teeny typo above: "won" -> "own".

> > > + */
> > > +#define HAS_BROKEN_THIS_IP
> > > +#endif
> >
> > As long as _THIS_IP_ is broken on some arches, it cannot be used anyway
> > when in a general API that can be used by arbitrary users?
>
> It more or less works today, and for debugging or tracing it's "good
> enough" in most cases.
>
> The plan would be to phase out the generic _THIS_IP_ once all
> architectures have a correct _THIS_IP_ implementation.
>
> > Is it something that can be fixed in all arches over time?
>
> Yes, I have patches for that.
>
> > > +/*
> > > + * _CODE_LOCATION_ provides a unique identifier for the current code location.
> > > + * When _THIS_IP_ is broken (generic version), we fall back to a static marker
> > > + * which guarantees uniqueness and resolves to a constant address at link time,
> > > + * avoiding runtime overhead and compiler optimizations breaking it.
> > > + */
> > > +#ifdef HAS_BROKEN_THIS_IP
> > > +#define _CODE_LOCATION_ ({ static const char __here; (unsigned long)&__here; })
> >
> > Nice!
> >
> > Yes, we don't really need the exact code location
> > for partitioning kmalloc caches.
> >
> > IIRC lockdep does a similar thing to define lock classes (unique for
> > each lock init location)
> >
> > > +#else
> > > +#define _CODE_LOCATION_ _THIS_IP_
> > >  #endif
> >
> > Probably we don't need this fallback?
>
> x86-64 is the only arch that has working fast _THIS_IP_, and adding
> static __here markers to bss is rather wasteful.
>
> More architectures will be supporting _THIS_IP_ properly once I get to
> send the patches. The mainstream architectures all have a reasonable
> and fast way to get the current IP, so we don't need to waste bss
> space there.

Thanks for finding a solution for this!

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 12+ messages in thread
Thread overview: 12+ messages
[not found] <20260424132427.2703076-1-elver@google.com>
[not found] ` <20260424132427.2703076-2-elver@google.com>
[not found] ` <9c321184-9080-4d5c-bd1a-a16cd0bbaed3@kernel.org>
[not found] ` <CANpmjNN_=g31Eoa+w1NrFALfp1dDBi5oHEZdr_bA_48-tS2M=Q@mail.gmail.com>
2026-05-04 15:00 ` [PATCH v3 2/2] slab: fix kernel-docs for mm-api Marco Elver
2026-05-11 12:07 ` Vlastimil Babka (SUSE)
2026-05-11 12:19 ` Jonathan Corbet
2026-05-11 16:34 ` Marco Elver
[not found] ` <6f2bd63a-dc02-4631-a3a5-7ec8e58a4a4e@kernel.org>
2026-05-04 21:22 ` [PATCH v3 1/2] slab: support for compiler-assisted type-based slab cache partitioning Marco Elver
2026-05-06 13:03 ` Marco Elver
2026-05-07 9:38 ` Nathan Chancellor
2026-05-07 21:49 ` Harry Yoo (Oracle)
2026-05-08 14:21 ` Marco Elver
2026-05-11 8:31 ` Harry Yoo (Oracle)
2026-05-11 9:34 ` Marco Elver
2026-05-11 18:14 ` Kees Cook