* [PATCH v2 01/16] mm/slab: do not limit zeroing to orig_size when only red zoning is enabled
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 02/16] mm/slab: do not init any kfence objects on allocation Vlastimil Babka (SUSE)
` (14 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE), stable
When init (zeroing) on allocation is requested, for kmalloc() we
generally have to zero the full object size even if a smaller size is
requested, in order to provide krealloc()'s __GFP_ZERO guarantees.
But if we track the requested size, krealloc() uses that information to
do the right thing. With red zoning also enabled, any unused size
becomes part of the red zone, so it must not be zeroed.
However, the existing check is imprecise and also triggers when only
SLAB_RED_ZONE is enabled without SLAB_STORE_USER. This means enabling
red zoning alone can compromise krealloc()'s __GFP_ZERO contract.
Fix this by using slub_debug_orig_size() instead, which is the exact
check for whether the requested size is tracked. We don't need to care
if red zoning is also enabled or not. Also update and expand the
comment accordingly.
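The difference between the two checks described above can be sketched in a small userspace model. The flag values, struct toy_cache and the helper names are assumptions made for this illustration, not the kernel's definitions. The "new" predicate assumes that the requested size is tracked only for kmalloc caches with SLAB_STORE_USER, which is how the commit message describes slub_debug_orig_size().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag values, chosen for this sketch only. */
#define TOY_RED_ZONE   0x1u
#define TOY_STORE_USER 0x2u
#define TOY_KMALLOC    0x4u

struct toy_cache {
	unsigned int flags;
	size_t object_size;	/* size of each slab object */
};

/* Old check: true if either debug flag is set on a kmalloc cache. */
static bool old_limits_zeroing(const struct toy_cache *s)
{
	return (s->flags & (TOY_STORE_USER | TOY_RED_ZONE)) &&
	       (s->flags & TOY_KMALLOC);
}

/* New check: true only when the requested size is assumed to be tracked. */
static bool tracks_orig_size(const struct toy_cache *s)
{
	return (s->flags & TOY_STORE_USER) && (s->flags & TOY_KMALLOC);
}

/* Number of bytes to zero for an allocation of orig_size bytes. */
static size_t zero_size(const struct toy_cache *s, size_t orig_size,
			bool (*limit)(const struct toy_cache *))
{
	assert(orig_size <= s->object_size);
	return limit(s) ? orig_size : s->object_size;
}
```

For a 128-byte kmalloc cache with only the red-zone flag set, zero_size(&s, 100, old_limits_zeroing) limits zeroing to 100 bytes even though nothing records the requested size, while zero_size(&s, 100, tracks_orig_size) zeroes the full 128 bytes.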
Fixes: 9ce67395f5a0 ("mm/slub: only zero requested size of buffer for kzalloc when debug enabled")
Cc: <stable@vger.kernel.org>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 63c1ef998dd3..e2ee8f1aaccf 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4574,15 +4574,17 @@ bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
gfp_t init_flags = flags & gfp_allowed_mask;
/*
- * For kmalloc object, the allocated memory size(object_size) is likely
- * larger than the requested size(orig_size). If redzone check is
- * enabled for the extra space, don't zero it, as it will be redzoned
- * soon. The redzone operation for this extra space could be seen as a
- * replacement of current poisoning under certain debug option, and
- * won't break other sanity checks.
+ * For kmalloc object, the allocated size (object_size) can be larger
+ * than the requested size (orig_size). We however need to zero the
+ * whole object_size to handle possible later krealloc() with
+ * __GFP_ZERO properly.
+ *
+ * But if we keep track of the requested size, krealloc() uses that
+ * information. Additionally if red zoning is enabled, the extra space
+ * is also red zone, so we should not overwrite it. So limit zeroing to
+ * orig_size if we track it.
*/
- if (kmem_cache_debug_flags(s, SLAB_STORE_USER | SLAB_RED_ZONE) &&
- (s->flags & SLAB_KMALLOC))
+ if (slub_debug_orig_size(s))
zero_size = orig_size;
/*
--
2.54.0
* [PATCH v2 02/16] mm/slab: do not init any kfence objects on allocation
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 01/16] mm/slab: do not limit zeroing to orig_size when only red zoning is enabled Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 03/16] mm/slab: stop inlining __slab_alloc_node() Vlastimil Babka (SUSE)
` (13 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
When init (zeroing) on allocation is requested, for kmalloc() we
generally have to zero the full object size even if a smaller size is
requested, in order to provide krealloc()'s __GFP_ZERO guarantees.
When we end up allocating a kfence object, kfence performs the zeroing on
its own because it has its own redzone beyond the requested size. Thus
slab_post_alloc_hook() has an 'init' parameter which has to be evaluated
in all callers (via slab_want_init_on_alloc()) and should be false for
kfence allocations.
For kfence allocations in slab_alloc_node() this is achieved by subtly
skipping over the slab_want_init_on_alloc() call. Other callers (i.e.
kmem_cache_alloc_bulk_noprof()) however evaluate it unconditionally even
if they do end up with a kfence allocation. This is only subtly not a
problem, as those are not kmalloc allocations and thus the "requested
size" equals s->object_size and thus it cannot interfere with kfence's
redzone. There's just an unnecessary double zeroing (in both kfence and
slab_post_alloc_hook()), but it's all very fragile and contradicts the
comment in kfence_guarded_alloc().
Remove this subtlety and simplify the code by eliminating the init
parameter from slab_post_alloc_hook() and making it call
slab_want_init_on_alloc() itself. Instead, add an is_kfence_address()
check before performing the memset, which does the right thing for all
callers of slab_post_alloc_hook().
This potentially adds overhead of the is_kfence_address() check to the
allocation hotpath, but that one is designed to be as small as possible,
and it's only evaluated if zeroing is about to happen. This means (aside
from init_on_alloc hardening) only for __GFP_ZERO allocations, and the
zeroing itself comes with an overhead likely larger than the added
check.
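The resulting control flow can be sketched in userspace as follows. This is only a toy model: toy_kfence_pool, toy_is_kfence_address() and toy_post_alloc_hook() are hypothetical stand-ins, and the range check is a simplification that is not meant to match how is_kfence_address() is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the KFENCE pool; the real pool is set up by KFENCE itself. */
static unsigned char toy_kfence_pool[256];

/* Toy counterpart of is_kfence_address(): a simple range check. */
static bool toy_is_kfence_address(const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t start = (uintptr_t)toy_kfence_pool;

	return a >= start && a < start + sizeof(toy_kfence_pool);
}

/*
 * Toy post-allocation hook: zero each object if init is wanted, unless the
 * object lives in the KFENCE pool, where KFENCE is assumed to have done its
 * own initialization already.
 */
static void toy_post_alloc_hook(void **p, size_t nr, bool want_init,
				size_t zero_size)
{
	assert(p != NULL);

	for (size_t i = 0; i < nr; i++) {
		if (p[i] && want_init && !toy_is_kfence_address(p[i]))
			memset(p[i], 0, zero_size);
	}
}
```

Because the pool check sits inside the hook, every caller gets the same behavior without having to compute and pass an init argument.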
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/kfence/core.c | 2 +-
mm/slub.c | 23 ++++++++---------------
2 files changed, 9 insertions(+), 16 deletions(-)
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 655dc5ce3240..5e0b406924e9 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -500,7 +500,7 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g
/*
* We check slab_want_init_on_alloc() ourselves, rather than letting
- * SL*B do the initialization, as otherwise we might overwrite KFENCE's
+ * slab do the initialization, as otherwise it might overwrite KFENCE's
* redzone.
*/
if (unlikely(slab_want_init_on_alloc(gfp, cache)))
diff --git a/mm/slub.c b/mm/slub.c
index e2ee8f1aaccf..8e5264d3ddbf 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4565,9 +4565,10 @@ struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s, gfp_t flags)
static __fastpath_inline
bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
- gfp_t flags, size_t size, void **p, bool init,
+ gfp_t flags, size_t size, void **p,
unsigned int orig_size)
{
+ bool init = slab_want_init_on_alloc(flags, s);
unsigned int zero_size = s->object_size;
bool kasan_init = init;
size_t i;
@@ -4608,7 +4609,8 @@ bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
for (i = 0; i < size; i++) {
p[i] = kasan_slab_alloc(s, p[i], init_flags, kasan_init);
if (p[i] && init && (!kasan_init ||
- !kasan_has_integrated_init()))
+ !kasan_has_integrated_init())
+ && !is_kfence_address(p[i]))
memset(p[i], 0, zero_size);
if (gfpflags_allow_spinning(flags))
kmemleak_alloc_recursive(p[i], s->object_size, 1,
@@ -4910,7 +4912,6 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
gfp_t gfpflags, int node, unsigned long addr, size_t orig_size)
{
void *object;
- bool init = false;
s = slab_pre_alloc_hook(s, gfpflags);
if (unlikely(!s))
@@ -4926,16 +4927,13 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
object = __slab_alloc_node(s, gfpflags, node, addr, orig_size);
maybe_wipe_obj_freeptr(s, object);
- init = slab_want_init_on_alloc(gfpflags, s);
out:
/*
- * When init equals 'true', like for kzalloc() family, only
- * @orig_size bytes might be zeroed instead of s->object_size
* In case this fails due to memcg_slab_post_alloc_hook(),
* object is set to NULL
*/
- slab_post_alloc_hook(s, lru, gfpflags, 1, &object, init, orig_size);
+ slab_post_alloc_hook(s, lru, gfpflags, 1, &object, orig_size);
return object;
}
@@ -5230,7 +5228,6 @@ kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *s, gfp_t gfp,
struct slab_sheaf *sheaf)
{
void *ret = NULL;
- bool init;
if (sheaf->size == 0)
goto out;
@@ -5240,10 +5237,8 @@ kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *s, gfp_t gfp,
if (likely(!ret))
ret = sheaf->objects[--sheaf->size];
- init = slab_want_init_on_alloc(gfp, s);
-
/* add __GFP_NOFAIL to force successful memcg charging */
- slab_post_alloc_hook(s, NULL, gfp | __GFP_NOFAIL, 1, &ret, init, s->object_size);
+ slab_post_alloc_hook(s, NULL, gfp | __GFP_NOFAIL, 1, &ret, s->object_size);
out:
trace_kmem_cache_alloc(_RET_IP_, ret, s, gfp, NUMA_NO_NODE);
@@ -5423,8 +5418,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
success:
maybe_wipe_obj_freeptr(s, ret);
- slab_post_alloc_hook(s, NULL, alloc_gfp, 1, &ret,
- slab_want_init_on_alloc(alloc_gfp, s), orig_size);
+ slab_post_alloc_hook(s, NULL, alloc_gfp, 1, &ret, orig_size);
ret = kasan_kmalloc(s, ret, orig_size, alloc_gfp);
return ret;
@@ -7339,8 +7333,7 @@ bool kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags,
out:
/* memcg and kmem_cache debug support and memory initialization */
- return likely(slab_post_alloc_hook(s, NULL, flags, size, p,
- slab_want_init_on_alloc(flags, s), s->object_size));
+ return likely(slab_post_alloc_hook(s, NULL, flags, size, p, s->object_size));
}
EXPORT_SYMBOL(kmem_cache_alloc_bulk_noprof);
--
2.54.0
* [PATCH v2 03/16] mm/slab: stop inlining __slab_alloc_node()
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 01/16] mm/slab: do not limit zeroing to orig_size when only red zoning is enabled Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 02/16] mm/slab: do not init any kfence objects on allocation Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 04/16] mm/slab: introduce slab_alloc_context Vlastimil Babka (SUSE)
` (12 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
With sheaves, __slab_alloc_node() is no longer part of the allocation
fastpath, so stop forcing it inline. For the same reason, also mark the
call to it from slab_alloc_node() as unlikely().
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 8e5264d3ddbf..7b48c0d38404 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4519,8 +4519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
return object;
}
-static __always_inline void *__slab_alloc_node(struct kmem_cache *s,
- gfp_t gfpflags, int node, unsigned long addr, size_t orig_size)
+static void *__slab_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node,
+ unsigned long addr, size_t orig_size)
{
void *object;
@@ -4923,7 +4923,7 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
object = alloc_from_pcs(s, gfpflags, node);
- if (!object)
+ if (unlikely(!object))
object = __slab_alloc_node(s, gfpflags, node, addr, orig_size);
maybe_wipe_obj_freeptr(s, object);
--
2.54.0
* [PATCH v2 04/16] mm/slab: introduce slab_alloc_context
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
` (2 preceding siblings ...)
2026-06-10 15:40 ` [PATCH v2 03/16] mm/slab: stop inlining __slab_alloc_node() Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 05/16] mm/slab: introduce alloc_flags and SLAB_ALLOC_TRYLOCK Vlastimil Babka (SUSE)
` (11 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
Similarly to page allocator's struct alloc_context, introduce a helper
struct to hold a part of the allocation arguments. This will allow
reducing the number of parameters in many functions of the
implementation, and extend them easily if needed.
For now, make it hold the caller address and the originally requested
allocation size.
Convert alloc_single_from_new_slab(), __slab_alloc_node() and
___slab_alloc(). No functional change intended.
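The pattern described above can be sketched in userspace like this. The toy_* names and the three-level call chain are hypothetical and only illustrate the shape of the change: the outermost function fills the context once, intermediate levels pass a single pointer, and the innermost level reads the fields it needs.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the two fields named in the commit message; toy type. */
struct toy_alloc_context {
	unsigned long caller_addr;
	unsigned long orig_size;
};

/* What the innermost level records; stands in for tracking metadata. */
struct toy_track {
	unsigned long addr;
	unsigned long size;
};

/* Innermost level: consumes the context fields. */
static void toy_level3(const struct toy_alloc_context *ac, struct toy_track *t)
{
	assert(ac && t);
	t->addr = ac->caller_addr;
	t->size = ac->orig_size;
}

/* Middle level: does not care about the fields, only forwards the pointer. */
static void toy_level2(int node, const struct toy_alloc_context *ac,
		       struct toy_track *t)
{
	(void)node;
	toy_level3(ac, t);
}

/* Outermost level: builds the context from its individual arguments. */
static void toy_level1(int node, unsigned long addr, size_t orig_size,
		       struct toy_track *t)
{
	struct toy_alloc_context ac = {
		.caller_addr = addr,
		.orig_size = orig_size,
	};

	toy_level2(node, &ac, t);
}
```

Adding another field later only touches the struct, the place that fills it, and the functions that read it; the signatures in between stay unchanged.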
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 46 +++++++++++++++++++++++++++++++++-------------
1 file changed, 33 insertions(+), 13 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 7b48c0d38404..a3cac7281cc6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -213,6 +213,12 @@ DEFINE_STATIC_KEY_FALSE(slub_debug_enabled);
static DEFINE_STATIC_KEY_FALSE(strict_numa);
#endif
+/* Structure holding extra parameters for slab allocations */
+struct slab_alloc_context {
+ unsigned long caller_addr;
+ unsigned long orig_size;
+};
+
/* Structure holding parameters for get_from_partial() call chain */
struct partial_context {
gfp_t flags;
@@ -3687,7 +3693,8 @@ static inline void init_slab_obj_iter(struct kmem_cache *s, struct slab *slab,
* and put the slab to the partial (or full) list.
*/
static void *alloc_single_from_new_slab(struct kmem_cache *s, struct slab *slab,
- int orig_size, bool allow_spin)
+ struct slab_alloc_context *ac,
+ bool allow_spin)
{
struct kmem_cache_node *n;
struct slab_obj_iter iter;
@@ -3705,7 +3712,7 @@ static void *alloc_single_from_new_slab(struct kmem_cache *s, struct slab *slab,
/* alloc_debug_processing() always expects a valid freepointer */
set_freepointer(s, object, slab->freelist);
- if (!alloc_debug_processing(s, slab, object, orig_size)) {
+ if (!alloc_debug_processing(s, slab, object, ac->orig_size)) {
/*
* It's not really expected that this would fail on a
* freshly allocated slab, but a concurrent memory
@@ -4443,7 +4450,7 @@ static unsigned int alloc_from_new_slab(struct kmem_cache *s, struct slab *slab,
* slab.
*/
static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
- unsigned long addr, unsigned int orig_size)
+ struct slab_alloc_context *ac)
{
bool allow_spin = gfpflags_allow_spinning(gfpflags);
void *object;
@@ -4476,7 +4483,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
pc.flags = GFP_NOWAIT | __GFP_THISNODE;
}
- pc.orig_size = orig_size;
+ pc.orig_size = ac->orig_size;
object = get_from_partial(s, node, &pc);
if (object)
goto success;
@@ -4496,7 +4503,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
stat(s, ALLOC_SLAB);
if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
- object = alloc_single_from_new_slab(s, slab, orig_size, allow_spin);
+ object = alloc_single_from_new_slab(s, slab, ac, allow_spin);
if (likely(object))
goto success;
@@ -4514,13 +4521,13 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
success:
if (kmem_cache_debug_flags(s, SLAB_STORE_USER))
- set_track(s, object, TRACK_ALLOC, addr, gfpflags);
+ set_track(s, object, TRACK_ALLOC, ac->caller_addr, gfpflags);
return object;
}
static void *__slab_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node,
- unsigned long addr, size_t orig_size)
+ struct slab_alloc_context *ac)
{
void *object;
@@ -4545,7 +4552,7 @@ static void *__slab_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node,
}
#endif
- object = ___slab_alloc(s, gfpflags, node, addr, orig_size);
+ object = ___slab_alloc(s, gfpflags, node, ac);
return object;
}
@@ -4923,8 +4930,13 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
object = alloc_from_pcs(s, gfpflags, node);
- if (unlikely(!object))
- object = __slab_alloc_node(s, gfpflags, node, addr, orig_size);
+ if (unlikely(!object)) {
+ struct slab_alloc_context ac = {
+ .caller_addr = addr,
+ .orig_size = orig_size,
+ };
+ object = __slab_alloc_node(s, gfpflags, node, &ac);
+ }
maybe_wipe_obj_freeptr(s, object);
@@ -5389,13 +5401,18 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
if (ret)
goto success;
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = orig_size,
+ };
+
/*
* Do not call slab_alloc_node(), since trylock mode isn't
* compatible with slab_pre_alloc_hook/should_failslab and
* kfence_alloc. Hence call __slab_alloc_node() (at most twice)
* and slab_post_alloc_hook() directly.
*/
- ret = __slab_alloc_node(s, alloc_gfp, node, _RET_IP_, orig_size);
+ ret = __slab_alloc_node(s, alloc_gfp, node, &ac);
/*
* It's possible we failed due to trylock as we preempted someone with
@@ -7237,10 +7254,13 @@ static bool __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
int i;
if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = s->object_size,
+ };
for (i = 0; i < size; i++) {
- p[i] = ___slab_alloc(s, flags, NUMA_NO_NODE, _RET_IP_,
- s->object_size);
+ p[i] = ___slab_alloc(s, flags, NUMA_NO_NODE, &ac);
if (unlikely(!p[i]))
goto error;
--
2.54.0
* [PATCH v2 05/16] mm/slab: introduce alloc_flags and SLAB_ALLOC_TRYLOCK
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
` (3 preceding siblings ...)
2026-06-10 15:40 ` [PATCH v2 04/16] mm/slab: introduce slab_alloc_context Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 06/16] mm/slab: add alloc_flags to slab_alloc_context Vlastimil Babka (SUSE)
` (10 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
Similarly to the page allocator, introduce slab-allocator specific
alloc flags that internally control allocation behavior in addition to
gfp_flags, without occupying the limited gfp flags space.
Introduce the first flag SLAB_ALLOC_TRYLOCK that behaves similarly to
page allocator's ALLOC_TRYLOCK and will be used to reimplement
kmalloc_nolock()'s "!allow_spin" behavior. That currently relies on
gfpflags_allow_spinning() and thus the lack of both __GFP_RECLAIM flags,
importantly __GFP_KSWAPD_RECLAIM. This can give false-positive results
e.g. in early boot with a restricted gfp_allowed_mask.
Also introduce alloc_flags_allow_spinning() to replace the usage of
gfpflags_allow_spinning().
Start using alloc_flags and the new check first in alloc_from_pcs() and
__pcs_replace_empty_main(). This means some slab allocations that were
falsely treated as kmalloc_nolock() due to their gfp flags will now have
a higher chance of succeeding, and this will further increase with
follow-up changes.
Remove a WARN_ON_ONCE() from refill_objects() as it's now legitimate to
reach it from a slab allocation that's not _nolock() and yet lacks
__GFP_KSWAPD_RECLAIM for other reasons.
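The false positive described above can be reproduced in a small userspace model. The TOY_GFP_* bit values and toy_gfpflags_allow_spinning() are assumptions for this sketch (the gfp-based helper is modeled as "any reclaim bit set"); the SLAB_ALLOC_* values and alloc_flags_allow_spinning() are copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy gfp bits for this sketch; not the kernel's values. */
#define TOY_GFP_DIRECT_RECLAIM 0x1u
#define TOY_GFP_KSWAPD_RECLAIM 0x2u
#define TOY_GFP_RECLAIM (TOY_GFP_DIRECT_RECLAIM | TOY_GFP_KSWAPD_RECLAIM)
#define TOY_GFP_ZERO 0x4u

/* Same values as in the patch. */
#define SLAB_ALLOC_DEFAULT 0x00
#define SLAB_ALLOC_TRYLOCK 0x01

static_assert((SLAB_ALLOC_DEFAULT & SLAB_ALLOC_TRYLOCK) == 0,
	      "the default must not imply trylock mode");

/* Heuristic: infer "may spin" from the presence of any reclaim bit. */
static bool toy_gfpflags_allow_spinning(unsigned int gfp)
{
	return !!(gfp & TOY_GFP_RECLAIM);
}

/* Explicit: spinning is forbidden only when the caller asked for trylock. */
static bool alloc_flags_allow_spinning(const unsigned int alloc_flags)
{
	return !(alloc_flags & SLAB_ALLOC_TRYLOCK);
}
```

If a restricted mask strips the reclaim bits from an ordinary allocation, the gfp-based heuristic reports that spinning is not allowed, whereas the explicit flag still distinguishes that allocation from a kmalloc_nolock() one.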
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slab.h | 9 +++++++++
mm/slub.c | 17 ++++++++---------
2 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/mm/slab.h b/mm/slab.h
index 1bf9c3021ae3..96f65b625600 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -16,6 +16,15 @@
* Internal slab definitions
*/
+/* slab's alloc_flags definitions */
+#define SLAB_ALLOC_DEFAULT 0x00 /* no flags */
+#define SLAB_ALLOC_TRYLOCK 0x01 /* a kmalloc_nolock() allocation */
+
+static inline bool alloc_flags_allow_spinning(const unsigned int alloc_flags)
+{
+ return !(alloc_flags & SLAB_ALLOC_TRYLOCK);
+}
+
#ifdef CONFIG_64BIT
# ifdef system_has_cmpxchg128
# define system_has_freelist_aba() system_has_cmpxchg128()
diff --git a/mm/slub.c b/mm/slub.c
index a3cac7281cc6..e79fbca11bc0 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4638,7 +4638,8 @@ bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
* unlocked.
*/
static struct slub_percpu_sheaves *
-__pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs, gfp_t gfp)
+__pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
+ gfp_t gfp, unsigned int alloc_flags)
{
struct slab_sheaf *empty = NULL;
struct slab_sheaf *full;
@@ -4664,7 +4665,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
return NULL;
}
- allow_spin = gfpflags_allow_spinning(gfp);
+ allow_spin = alloc_flags_allow_spinning(alloc_flags);
full = barn_replace_empty_sheaf(barn, pcs->main, allow_spin);
@@ -4750,7 +4751,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
}
static __fastpath_inline
-void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp, int node)
+void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp, unsigned int alloc_flags, int node)
{
struct slub_percpu_sheaves *pcs;
bool node_requested;
@@ -4795,7 +4796,7 @@ void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp, int node)
pcs = this_cpu_ptr(s->cpu_sheaves);
if (unlikely(pcs->main->size == 0)) {
- pcs = __pcs_replace_empty_main(s, pcs, gfp);
+ pcs = __pcs_replace_empty_main(s, pcs, gfp, alloc_flags);
if (unlikely(!pcs))
return NULL;
}
@@ -4928,7 +4929,7 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
if (unlikely(object))
goto out;
- object = alloc_from_pcs(s, gfpflags, node);
+ object = alloc_from_pcs(s, gfpflags, SLAB_ALLOC_DEFAULT, node);
if (unlikely(!object)) {
struct slab_alloc_context ac = {
@@ -5359,6 +5360,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
{
gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
size_t orig_size = size;
+ unsigned int alloc_flags = SLAB_ALLOC_TRYLOCK;
struct kmem_cache *s;
bool can_retry = true;
void *ret;
@@ -5397,7 +5399,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
*/
return NULL;
- ret = alloc_from_pcs(s, alloc_gfp, node);
+ ret = alloc_from_pcs(s, alloc_gfp, alloc_flags, node);
if (ret)
goto success;
@@ -7216,9 +7218,6 @@ refill_objects(struct kmem_cache *s, void **p, gfp_t gfp, unsigned int min,
unsigned int refilled;
struct slab *slab;
- if (WARN_ON_ONCE(!gfpflags_allow_spinning(gfp)))
- return 0;
-
refilled = __refill_objects_node(s, p, gfp, min, max,
get_node(s, local_node),
/* allow_spin = */ true);
--
2.54.0
* [PATCH v2 06/16] mm/slab: add alloc_flags to slab_alloc_context
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
` (4 preceding siblings ...)
2026-06-10 15:40 ` [PATCH v2 05/16] mm/slab: introduce alloc_flags and SLAB_ALLOC_TRYLOCK Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 07/16] mm/slab: replace struct partial_context with slab_alloc_context Vlastimil Babka (SUSE)
` (9 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
Add alloc_flags as a new field to the slab_alloc_context helper struct,
so we can pass it to more functions in the slab implementation without
adding another function parameter.
Start checking it via alloc_flags_allow_spinning() in
alloc_single_from_new_slab() (where we can drop the allow_spin
parameter) and ___slab_alloc(). This further reduces false-positive
spinning-not-allowed from allocations that are not kmalloc_nolock() but
lack __GFP_RECLAIM flags.
_kmalloc_nolock_noprof() initializes ac.alloc_flags from its local
alloc_flags variable, which is SLAB_ALLOC_TRYLOCK. slab_alloc_node() and
__kmem_cache_alloc_bulk() are not reachable from kmalloc_nolock() and
all their callers expect spinning to be allowed, so they can use
SLAB_ALLOC_DEFAULT. This is
temporary as the scope of slab_alloc_context will further move to the
callers, making the alloc_flags usage more obvious.
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index e79fbca11bc0..ef745b37d063 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -217,6 +217,7 @@ static DEFINE_STATIC_KEY_FALSE(strict_numa);
struct slab_alloc_context {
unsigned long caller_addr;
unsigned long orig_size;
+ unsigned int alloc_flags;
};
/* Structure holding parameters for get_from_partial() call chain */
@@ -3693,9 +3694,9 @@ static inline void init_slab_obj_iter(struct kmem_cache *s, struct slab *slab,
* and put the slab to the partial (or full) list.
*/
static void *alloc_single_from_new_slab(struct kmem_cache *s, struct slab *slab,
- struct slab_alloc_context *ac,
- bool allow_spin)
+ struct slab_alloc_context *ac)
{
+ bool allow_spin = alloc_flags_allow_spinning(ac->alloc_flags);
struct kmem_cache_node *n;
struct slab_obj_iter iter;
bool needs_add_partial;
@@ -4452,7 +4453,7 @@ static unsigned int alloc_from_new_slab(struct kmem_cache *s, struct slab *slab,
static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
struct slab_alloc_context *ac)
{
- bool allow_spin = gfpflags_allow_spinning(gfpflags);
+ bool allow_spin = alloc_flags_allow_spinning(ac->alloc_flags);
void *object;
struct slab *slab;
struct partial_context pc;
@@ -4503,7 +4504,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
stat(s, ALLOC_SLAB);
if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
- object = alloc_single_from_new_slab(s, slab, ac, allow_spin);
+ object = alloc_single_from_new_slab(s, slab, ac);
if (likely(object))
goto success;
@@ -4919,6 +4920,7 @@ unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, gfp_t gfp, size_t size,
static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list_lru *lru,
gfp_t gfpflags, int node, unsigned long addr, size_t orig_size)
{
+ const unsigned int alloc_flags = SLAB_ALLOC_DEFAULT;
void *object;
s = slab_pre_alloc_hook(s, gfpflags);
@@ -4929,12 +4931,13 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
if (unlikely(object))
goto out;
- object = alloc_from_pcs(s, gfpflags, SLAB_ALLOC_DEFAULT, node);
+ object = alloc_from_pcs(s, gfpflags, alloc_flags, node);
if (unlikely(!object)) {
struct slab_alloc_context ac = {
.caller_addr = addr,
.orig_size = orig_size,
+ .alloc_flags = alloc_flags,
};
object = __slab_alloc_node(s, gfpflags, node, &ac);
}
@@ -5406,6 +5409,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
struct slab_alloc_context ac = {
.caller_addr = _RET_IP_,
.orig_size = orig_size,
+ .alloc_flags = alloc_flags,
};
/*
@@ -7256,6 +7260,7 @@ static bool __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
struct slab_alloc_context ac = {
.caller_addr = _RET_IP_,
.orig_size = s->object_size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
};
for (i = 0; i < size; i++) {
--
2.54.0
* [PATCH v2 07/16] mm/slab: replace struct partial_context with slab_alloc_context
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
` (5 preceding siblings ...)
2026-06-10 15:40 ` [PATCH v2 06/16] mm/slab: add alloc_flags to slab_alloc_context Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 08/16] mm/slab: pass alloc_flags to new slab allocation Vlastimil Babka (SUSE)
` (8 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
Refactor get_from_partial_node(), get_from_any_partial(),
get_from_partial() and ___slab_alloc().
Remove struct partial_context, which used to be more substantial but
shrank as part of the sheaves conversion. Instead pass gfp_flags and a
pointer to the new slab_alloc_context, which together are a superset of
partial_context.
This means alloc_flags are now available and we can use them to
determine if spinning is allowed, further reducing false-positive "not
allowed" results in the slow path due to gfp flags lacking __GFP_RECLAIM.
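The lock-or-trylock decision that now depends on alloc_flags can be sketched in userspace as follows. Everything here is a toy: the atomic_flag stands in for the node's list_lock, struct toy_node and struct toy_alloc_context are hypothetical, and the unlocked nr_partial pre-check is kept only to mirror the shape of the kernel function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SLAB_ALLOC_TRYLOCK 0x01	/* value taken from the series */

struct toy_node {
	atomic_flag list_lock;	/* stand-in for the node's list_lock */
	int nr_partial;
};

struct toy_alloc_context {
	unsigned int alloc_flags;
};

static bool toy_trylock(atomic_flag *l)
{
	return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

static void toy_lock(atomic_flag *l)
{
	while (!toy_trylock(l))
		;	/* spin until the holder releases the lock */
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Returns true if a partial object was taken, false if the list was empty
 * or, in trylock mode, the lock was contended.
 */
static bool toy_get_from_partial_node(struct toy_node *n,
				      const struct toy_alloc_context *ac)
{
	bool got = false;

	assert(ac != NULL);
	if (!n || !n->nr_partial)
		return false;

	if (!(ac->alloc_flags & SLAB_ALLOC_TRYLOCK))
		toy_lock(&n->list_lock);
	else if (!toy_trylock(&n->list_lock))
		return false;

	if (n->nr_partial > 0) {
		n->nr_partial--;
		got = true;
	}
	toy_unlock(&n->list_lock);
	return got;
}
```

Only a context carrying SLAB_ALLOC_TRYLOCK gives up on a contended lock; an allocation whose gfp flags merely lack reclaim bits takes the blocking path like any other.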
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 52 ++++++++++++++++++++++++----------------------------
1 file changed, 24 insertions(+), 28 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index ef745b37d063..98b79e5e7679 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -220,12 +220,6 @@ struct slab_alloc_context {
unsigned int alloc_flags;
};
-/* Structure holding parameters for get_from_partial() call chain */
-struct partial_context {
- gfp_t flags;
- unsigned int orig_size;
-};
-
/* Structure holding parameters for get_partial_node_bulk() */
struct partial_bulk_context {
gfp_t flags;
@@ -3826,7 +3820,8 @@ static bool get_partial_node_bulk(struct kmem_cache *s,
*/
static void *get_from_partial_node(struct kmem_cache *s,
struct kmem_cache_node *n,
- struct partial_context *pc)
+ gfp_t gfp_flags,
+ struct slab_alloc_context *ac)
{
struct slab *slab, *slab2;
unsigned long flags;
@@ -3841,7 +3836,7 @@ static void *get_from_partial_node(struct kmem_cache *s,
if (!n || !n->nr_partial)
return NULL;
- if (gfpflags_allow_spinning(pc->flags))
+ if (alloc_flags_allow_spinning(ac->alloc_flags))
spin_lock_irqsave(&n->list_lock, flags);
else if (!spin_trylock_irqsave(&n->list_lock, flags))
return NULL;
@@ -3849,12 +3844,12 @@ static void *get_from_partial_node(struct kmem_cache *s,
struct freelist_counters old, new;
- if (!pfmemalloc_match(slab, pc->flags))
+ if (!pfmemalloc_match(slab, gfp_flags))
continue;
if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
object = alloc_single_from_partial(s, n, slab,
- pc->orig_size);
+ ac->orig_size);
if (object)
break;
continue;
@@ -3888,15 +3883,16 @@ static void *get_from_partial_node(struct kmem_cache *s,
/*
* Get an object from somewhere. Search in increasing NUMA distances.
*/
-static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *pc)
+static void *get_from_any_partial(struct kmem_cache *s, gfp_t gfp_flags,
+ struct slab_alloc_context *ac)
{
#ifdef CONFIG_NUMA
struct zonelist *zonelist;
struct zoneref *z;
struct zone *zone;
- enum zone_type highest_zoneidx = gfp_zone(pc->flags);
+ enum zone_type highest_zoneidx = gfp_zone(gfp_flags);
unsigned int cpuset_mems_cookie;
- bool allow_spin = gfpflags_allow_spinning(pc->flags);
+ bool allow_spin = alloc_flags_allow_spinning(ac->alloc_flags);
/*
* The defrag ratio allows a configuration of the tradeoffs between
@@ -3930,16 +3926,17 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
if (allow_spin)
cpuset_mems_cookie = read_mems_allowed_begin();
- zonelist = node_zonelist(mempolicy_slab_node(), pc->flags);
+ zonelist = node_zonelist(mempolicy_slab_node(), gfp_flags);
for_each_zone_zonelist(zone, z, zonelist, highest_zoneidx) {
struct kmem_cache_node *n;
n = get_node(s, zone_to_nid(zone));
- if (n && cpuset_zone_allowed(zone, pc->flags) &&
+ if (n && cpuset_zone_allowed(zone, gfp_flags) &&
n->nr_partial > s->min_partial) {
- void *object = get_from_partial_node(s, n, pc);
+ void *object = get_from_partial_node(s, n,
+ gfp_flags, ac);
if (object) {
/*
@@ -3961,8 +3958,8 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
/*
* Get an object from a partial slab
*/
-static void *get_from_partial(struct kmem_cache *s, int node,
- struct partial_context *pc)
+static void *get_from_partial(struct kmem_cache *s, int node, gfp_t flags,
+ struct slab_alloc_context *ac)
{
int searchnode = node;
void *object;
@@ -3970,11 +3967,11 @@ static void *get_from_partial(struct kmem_cache *s, int node,
if (node == NUMA_NO_NODE)
searchnode = numa_mem_id();
- object = get_from_partial_node(s, get_node(s, searchnode), pc);
- if (object || (node != NUMA_NO_NODE && (pc->flags & __GFP_THISNODE)))
+ object = get_from_partial_node(s, get_node(s, searchnode), flags, ac);
+ if (object || (node != NUMA_NO_NODE && (flags & __GFP_THISNODE)))
return object;
- return get_from_any_partial(s, pc);
+ return get_from_any_partial(s, flags, ac);
}
static bool has_pcs_used(int cpu, struct kmem_cache *s)
@@ -4454,16 +4451,16 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
struct slab_alloc_context *ac)
{
bool allow_spin = alloc_flags_allow_spinning(ac->alloc_flags);
+ gfp_t trynode_flags;
void *object;
struct slab *slab;
- struct partial_context pc;
bool try_thisnode = true;
stat(s, ALLOC_SLOWPATH);
new_objects:
- pc.flags = gfpflags;
+ trynode_flags = gfpflags;
/*
* When a preferred node is indicated but no __GFP_THISNODE
*
@@ -4479,17 +4476,16 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
&& try_thisnode)) {
if (unlikely(!allow_spin))
/* Do not upgrade gfp to NOWAIT from more restrictive mode */
- pc.flags = gfpflags | __GFP_THISNODE;
+ trynode_flags = gfpflags | __GFP_THISNODE;
else
- pc.flags = GFP_NOWAIT | __GFP_THISNODE;
+ trynode_flags = GFP_NOWAIT | __GFP_THISNODE;
}
- pc.orig_size = ac->orig_size;
- object = get_from_partial(s, node, &pc);
+ object = get_from_partial(s, node, trynode_flags, ac);
if (object)
goto success;
- slab = new_slab(s, pc.flags, node);
+ slab = new_slab(s, trynode_flags, node);
if (unlikely(!slab)) {
if (node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE)
--
2.54.0
* [PATCH v2 08/16] mm/slab: pass alloc_flags to new slab allocation
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
` (6 preceding siblings ...)
2026-06-10 15:40 ` [PATCH v2 07/16] mm/slab: replace struct partial_context with slab_alloc_context Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 09/16] mm/slab: pass alloc_flags through slab_post_alloc_hook() chain Vlastimil Babka (SUSE)
` (7 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
Add the alloc_flags parameter to allocate_slab() and new_slab()
so it can be used to determine if spinning is allowed, independently
from gfp flags.
refill_objects() passes SLAB_ALLOC_DEFAULT because it can only be
reached from contexts that allow spinning.
Also change how trynode_flags are constructed in ___slab_alloc() to
achieve the same "do not upgrade to GFP_NOWAIT" behavior by masking
instead of branching. The flags are now also not upgraded when gfp is
weaker than GFP_NOWAIT (i.e. lacks __GFP_KSWAPD_RECLAIM) but doesn't
come from kmalloc_nolock(), which is more correct anyway.
During the masking, also keep any existing __GFP_NOMEMALLOC (pointed out
by Sashiko) and __GFP_ACCOUNT. Previously the hardcoded GFP_NOWAIT would
drop them, but that was not a big enough problem to need a separate
fix.
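To illustrate the difference between the old branch and the new masking,
here is a minimal userspace sketch. It is not kernel code: the bit values
are made up for illustration, GFP_KERNEL is reduced to its two reclaim
bits, and the helper names trynode_flags_branch() and
trynode_flags_mask() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration only; the real gfp bits differ. */
typedef unsigned int gfp_t;
#define __GFP_KSWAPD_RECLAIM	0x01u
#define __GFP_DIRECT_RECLAIM	0x02u
#define __GFP_NOWARN		0x04u
#define __GFP_THISNODE		0x08u
#define __GFP_NOMEMALLOC	0x10u
#define __GFP_ACCOUNT		0x20u
#define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM | __GFP_NOWARN)
/* Simplified: only the reclaim bits are modelled here. */
#define GFP_KERNEL	(__GFP_KSWAPD_RECLAIM | __GFP_DIRECT_RECLAIM)

/* Old behavior: branch on allow_spin, otherwise hardcode GFP_NOWAIT. */
static gfp_t trynode_flags_branch(gfp_t gfpflags, bool allow_spin)
{
	if (!allow_spin)
		return gfpflags | __GFP_THISNODE;
	return GFP_NOWAIT | __GFP_THISNODE;
}

/* New behavior: mask the caller's flags down to at most GFP_NOWAIT. */
static gfp_t trynode_flags_mask(gfp_t gfpflags)
{
	gfp_t flags = gfpflags;

	flags &= GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_ACCOUNT;
	flags |= __GFP_NOWARN | __GFP_THISNODE;

	/* Masking can only remove reclaim bits, never add direct reclaim. */
	assert(!(flags & __GFP_DIRECT_RECLAIM));
	return flags;
}
```

In this sketch, passing flags without __GFP_KSWAPD_RECLAIM to
trynode_flags_mask() leaves that bit clear, while trynode_flags_branch()
with allow_spin == true would set it. Likewise __GFP_NOMEMALLOC and
__GFP_ACCOUNT survive the masking but not the hardcoded GFP_NOWAIT.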
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 98b79e5e7679..8f6ca3d5fdfa 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3378,9 +3378,10 @@ static __always_inline void unaccount_slab(struct slab *slab, int order,
}
/* Allocate and initialize a slab without building its freelist. */
-static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
+static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags,
+ unsigned int alloc_flags, int node)
{
- bool allow_spin = gfpflags_allow_spinning(flags);
+ bool allow_spin = alloc_flags_allow_spinning(alloc_flags);
struct slab *slab;
struct kmem_cache_order_objects oo = s->oo;
gfp_t alloc_gfp;
@@ -3438,15 +3439,17 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
return slab;
}
-static struct slab *new_slab(struct kmem_cache *s, gfp_t flags, int node)
+static struct slab *new_slab(struct kmem_cache *s, gfp_t flags,
+ unsigned int alloc_flags, int node)
{
if (unlikely(flags & GFP_SLAB_BUG_MASK))
flags = kmalloc_fix_flags(flags);
WARN_ON_ONCE(s->ctor && (flags & __GFP_ZERO));
- return allocate_slab(s,
- flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
+ flags &= GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK;
+
+ return allocate_slab(s, flags, alloc_flags, node);
}
static void __free_slab(struct kmem_cache *s, struct slab *slab, bool allow_spin)
@@ -4467,25 +4470,22 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
* 1) try to get a partial slab from target node only by having
* __GFP_THISNODE in pc.flags for get_from_partial()
* 2) if 1) failed, try to allocate a new slab from target node with
- * GPF_NOWAIT | __GFP_THISNODE opportunistically
+ * (at most) GFP_NOWAIT | __GFP_THISNODE opportunistically
* 3) if 2) failed, retry with original gfpflags which will allow
* get_from_partial() try partial lists of other nodes before
* potentially allocating new page from other nodes
*/
if (unlikely(node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE)
&& try_thisnode)) {
- if (unlikely(!allow_spin))
- /* Do not upgrade gfp to NOWAIT from more restrictive mode */
- trynode_flags = gfpflags | __GFP_THISNODE;
- else
- trynode_flags = GFP_NOWAIT | __GFP_THISNODE;
+ trynode_flags &= GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_ACCOUNT;
+ trynode_flags |= __GFP_NOWARN | __GFP_THISNODE;
}
object = get_from_partial(s, node, trynode_flags, ac);
if (object)
goto success;
- slab = new_slab(s, trynode_flags, node);
+ slab = new_slab(s, trynode_flags, ac->alloc_flags, node);
if (unlikely(!slab)) {
if (node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE)
@@ -7231,7 +7231,7 @@ refill_objects(struct kmem_cache *s, void **p, gfp_t gfp, unsigned int min,
new_slab:
- slab = new_slab(s, gfp, local_node);
+ slab = new_slab(s, gfp, SLAB_ALLOC_DEFAULT, local_node);
if (!slab)
goto out;
@@ -7579,7 +7579,7 @@ static void early_kmem_cache_node_alloc(int node)
BUG_ON(kmem_cache_node->size < sizeof(struct kmem_cache_node));
- slab = new_slab(kmem_cache_node, GFP_NOWAIT, node);
+ slab = new_slab(kmem_cache_node, GFP_NOWAIT, SLAB_ALLOC_DEFAULT, node);
BUG_ON(!slab);
if (slab_nid(slab) != node) {
--
2.54.0
* [PATCH v2 09/16] mm/slab: pass alloc_flags through slab_post_alloc_hook() chain
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
` (7 preceding siblings ...)
2026-06-10 15:40 ` [PATCH v2 08/16] mm/slab: pass alloc_flags to new slab allocation Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 10/16] mm/slab: replace slab_alloc_node() parameters with slab_alloc_context Vlastimil Babka (SUSE)
` (6 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
Convert the whole following call stack to pass either slab_alloc_context
(thus including alloc_flags) or just alloc_flags as necessary:
slab_post_alloc_hook()
alloc_tagging_slab_alloc_hook()
__alloc_tagging_slab_alloc_hook()
prepare_slab_obj_exts_hook()
alloc_slab_obj_exts()
memcg_slab_post_alloc_hook()
__memcg_slab_post_alloc_hook()
alloc_slab_obj_exts()
Converting all these at once avoids unnecessary churn and is mostly
mechanical.
This ultimately allows deciding whether spinning is allowed using
alloc_flags in alloc_slab_obj_exts(), as well as in
slab_post_alloc_hook(). Aside from alloc_from_pcs_bulk() (to be handled
next), nothing else in slab itself relies on gfpflags_allow_spinning(),
which can be false even if not called from kmalloc_nolock().
A followup change will also use the alloc_flags availability in the call
stack above to remove the __GFP_NO_OBJ_EXT flag.
For alloc_slab_obj_exts(), also replace the suboptimal "bool new_slab"
parameter with a SLAB_ALLOC_NEW_SLAB flag with identical functionality.
To further reduce the number of parameters of slab_post_alloc_hook(),
also make 'struct list_lru *lru' (which is NULL for most callers) a new
field of slab_alloc_context.
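As a rough userspace sketch of how the flag replaces the bool: the three
flag values match the mm/slab.h hunk below, but the body of
alloc_flags_allow_spinning() is an assumption (the hunk shows only its
signature), and mark_new_slab() and is_new_slab() are hypothetical
helpers that stand in for what account_slab() and alloc_slab_obj_exts()
do with the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in the mm/slab.h hunk of this patch. */
#define SLAB_ALLOC_DEFAULT	0x00u	/* no flags */
#define SLAB_ALLOC_TRYLOCK	0x01u	/* a kmalloc_nolock() allocation */
#define SLAB_ALLOC_NEW_SLAB	0x02u	/* a flag for alloc_slab_obj_exts() */

/* Assumed body: spinning is allowed unless this is a trylock allocation. */
static inline bool alloc_flags_allow_spinning(const unsigned int alloc_flags)
{
	return !(alloc_flags & SLAB_ALLOC_TRYLOCK);
}

/*
 * Hypothetical helper mirroring account_slab(): the caller's flags are
 * passed through unchanged, with SLAB_ALLOC_NEW_SLAB ORed in.
 */
static unsigned int mark_new_slab(unsigned int alloc_flags)
{
	/* Sanity check for this sketch only: the flag is not set twice. */
	assert(!(alloc_flags & SLAB_ALLOC_NEW_SLAB));
	return alloc_flags | SLAB_ALLOC_NEW_SLAB;
}

/* Hypothetical helper mirroring the test inside alloc_slab_obj_exts(). */
static bool is_new_slab(unsigned int alloc_flags)
{
	return (alloc_flags & SLAB_ALLOC_NEW_SLAB) != 0;
}
```

The point of the single unsigned int is that the "new slab" information
and the "may spin" information travel together, so a callee can query
either one without an extra parameter.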
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/memcontrol.c | 5 +--
mm/slab.h | 6 ++--
mm/slub.c | 94 +++++++++++++++++++++++++++++++++------------------------
3 files changed, 62 insertions(+), 43 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c03d4787d466..29390ba13baa 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3424,7 +3424,8 @@ static inline size_t obj_full_size(struct kmem_cache *s)
}
bool __memcg_slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
- gfp_t flags, size_t size, void **p)
+ gfp_t flags, unsigned int slab_alloc_flags,
+ size_t size, void **p)
{
size_t obj_size = obj_full_size(s);
struct obj_cgroup *objcg;
@@ -3472,7 +3473,7 @@ bool __memcg_slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
slab = virt_to_slab(p[i]);
if (!slab_obj_exts(slab) &&
- alloc_slab_obj_exts(slab, s, flags, false)) {
+ alloc_slab_obj_exts(slab, s, flags, slab_alloc_flags)) {
continue;
}
diff --git a/mm/slab.h b/mm/slab.h
index 96f65b625600..4db6d8aa0ee3 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -19,6 +19,7 @@
/* slab's alloc_flags definitions */
#define SLAB_ALLOC_DEFAULT 0x00 /* no flags */
#define SLAB_ALLOC_TRYLOCK 0x01 /* a kmalloc_nolock() allocation */
+#define SLAB_ALLOC_NEW_SLAB 0x02 /* a flag for alloc_slab_obj_exts() */
static inline bool alloc_flags_allow_spinning(const unsigned int alloc_flags)
{
@@ -612,7 +613,7 @@ static inline struct slabobj_ext *slab_obj_ext(struct slab *slab,
}
int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
- gfp_t gfp, bool new_slab);
+ gfp_t gfp, unsigned int alloc_flags);
#else /* CONFIG_SLAB_OBJ_EXT */
@@ -642,7 +643,8 @@ static inline enum node_stat_item cache_vmstat_idx(struct kmem_cache *s)
#ifdef CONFIG_MEMCG
bool __memcg_slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
- gfp_t flags, size_t size, void **p);
+ gfp_t flags, unsigned int slab_alloc_flags,
+ size_t size, void **p);
void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
void **p, int objects, unsigned long obj_exts);
#endif
diff --git a/mm/slub.c b/mm/slub.c
index 8f6ca3d5fdfa..e634137b67fa 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -218,6 +218,7 @@ struct slab_alloc_context {
unsigned long caller_addr;
unsigned long orig_size;
unsigned int alloc_flags;
+ struct list_lru *lru;
};
/* Structure holding parameters for get_partial_node_bulk() */
@@ -2155,9 +2156,9 @@ static inline size_t obj_exts_alloc_size(struct kmem_cache *s,
}
int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
- gfp_t gfp, bool new_slab)
+ gfp_t gfp, unsigned int alloc_flags)
{
- bool allow_spin = gfpflags_allow_spinning(gfp);
+ const bool allow_spin = alloc_flags_allow_spinning(alloc_flags);
unsigned int objects = objs_per_slab(s, slab);
unsigned long new_exts;
unsigned long old_exts;
@@ -2206,7 +2207,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
old_exts = READ_ONCE(slab->obj_exts);
handle_failed_objexts_alloc(old_exts, vec, objects);
- if (new_slab) {
+ if (alloc_flags & SLAB_ALLOC_NEW_SLAB) {
/*
* If the slab is brand new and nobody can yet access its
* obj_exts, no synchronization is required and obj_exts can
@@ -2331,7 +2332,7 @@ static inline void init_slab_obj_exts(struct slab *slab)
}
static int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
- gfp_t gfp, bool new_slab)
+ gfp_t gfp, unsigned int alloc_flags)
{
return 0;
}
@@ -2351,10 +2352,10 @@ static inline void alloc_slab_obj_exts_early(struct kmem_cache *s,
static inline unsigned long
prepare_slab_obj_exts_hook(struct kmem_cache *s, struct slab *slab,
- gfp_t flags, void *p)
+ gfp_t flags, unsigned int alloc_flags, void *p)
{
if (!slab_obj_exts(slab) &&
- alloc_slab_obj_exts(slab, s, flags, false)) {
+ alloc_slab_obj_exts(slab, s, flags, alloc_flags)) {
pr_warn_once("%s, %s: Failed to create slab extension vector!\n",
__func__, s->name);
return 0;
@@ -2366,7 +2367,8 @@ prepare_slab_obj_exts_hook(struct kmem_cache *s, struct slab *slab,
/* Should be called only if mem_alloc_profiling_enabled() */
static noinline void
-__alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
+__alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags,
+ unsigned int alloc_flags)
{
unsigned long obj_exts;
struct slabobj_ext *obj_ext;
@@ -2382,7 +2384,7 @@ __alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
return;
slab = virt_to_slab(object);
- obj_exts = prepare_slab_obj_exts_hook(s, slab, flags, object);
+ obj_exts = prepare_slab_obj_exts_hook(s, slab, flags, alloc_flags, object);
/*
* Currently obj_exts is used only for allocation profiling.
* If other users appear then mem_alloc_profiling_enabled()
@@ -2401,10 +2403,11 @@ __alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
}
static inline void
-alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
+alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags,
+ unsigned int alloc_flags)
{
if (mem_alloc_profiling_enabled())
- __alloc_tagging_slab_alloc_hook(s, object, flags);
+ __alloc_tagging_slab_alloc_hook(s, object, flags, alloc_flags);
}
/* Should be called only if mem_alloc_profiling_enabled() */
@@ -2443,7 +2446,8 @@ alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
#else /* CONFIG_MEM_ALLOC_PROFILING */
static inline void
-alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
+alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags,
+ unsigned int alloc_flags)
{
}
@@ -2461,8 +2465,9 @@ alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
static void memcg_alloc_abort_single(struct kmem_cache *s, void *object);
static __fastpath_inline
-bool memcg_slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
- gfp_t flags, size_t size, void **p)
+bool memcg_slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
+ size_t size, void **p,
+ struct slab_alloc_context *ac)
{
if (likely(!memcg_kmem_online()))
return true;
@@ -2470,7 +2475,8 @@ bool memcg_slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
if (likely(!(flags & __GFP_ACCOUNT) && !(s->flags & SLAB_ACCOUNT)))
return true;
- if (likely(__memcg_slab_post_alloc_hook(s, lru, flags, size, p)))
+ if (likely(__memcg_slab_post_alloc_hook(s, ac->lru, flags,
+ ac->alloc_flags, size, p)))
return true;
if (likely(size == 1)) {
@@ -2558,14 +2564,15 @@ bool memcg_slab_post_charge(void *p, gfp_t flags)
put_slab_obj_exts(obj_exts);
}
- return __memcg_slab_post_alloc_hook(s, NULL, flags, 1, &p);
+ return __memcg_slab_post_alloc_hook(s, NULL, flags, SLAB_ALLOC_DEFAULT,
+ 1, &p);
}
#else /* CONFIG_MEMCG */
static inline bool memcg_slab_post_alloc_hook(struct kmem_cache *s,
- struct list_lru *lru,
- gfp_t flags, size_t size,
- void **p)
+ gfp_t flags,
+ size_t size, void **p,
+ struct slab_alloc_context *ac)
{
return true;
}
@@ -3352,12 +3359,14 @@ static inline void init_freelist_randomization(void) { }
#endif /* CONFIG_SLAB_FREELIST_RANDOM */
static __always_inline void account_slab(struct slab *slab, int order,
- struct kmem_cache *s, gfp_t gfp)
+ struct kmem_cache *s, gfp_t gfp,
+ unsigned int alloc_flags)
{
if (memcg_kmem_online() &&
(s->flags & SLAB_ACCOUNT) &&
!slab_obj_exts(slab))
- alloc_slab_obj_exts(slab, s, gfp, true);
+ alloc_slab_obj_exts(slab, s, gfp,
+ alloc_flags | SLAB_ALLOC_NEW_SLAB);
mod_node_page_state(slab_pgdat(slab), cache_vmstat_idx(s),
PAGE_SIZE << order);
@@ -3434,7 +3443,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags,
* to prevent the array from being overwritten.
*/
alloc_slab_obj_exts_early(s, slab);
- account_slab(slab, oo_order(oo), s, flags);
+ account_slab(slab, oo_order(oo), s, flags, alloc_flags);
return slab;
}
@@ -4568,9 +4577,8 @@ struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s, gfp_t flags)
}
static __fastpath_inline
-bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
- gfp_t flags, size_t size, void **p,
- unsigned int orig_size)
+bool slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags, size_t size,
+ void **p, struct slab_alloc_context *ac)
{
bool init = slab_want_init_on_alloc(flags, s);
unsigned int zero_size = s->object_size;
@@ -4590,7 +4598,7 @@ bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
* orig_size if we track it.
*/
if (slub_debug_orig_size(s))
- zero_size = orig_size;
+ zero_size = ac->orig_size;
/*
* When slab_debug is enabled, avoid memory initialization integrated
@@ -4616,14 +4624,14 @@ bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
!kasan_has_integrated_init())
&& !is_kfence_address(p[i]))
memset(p[i], 0, zero_size);
- if (gfpflags_allow_spinning(flags))
+ if (alloc_flags_allow_spinning(ac->alloc_flags))
kmemleak_alloc_recursive(p[i], s->object_size, 1,
s->flags, init_flags);
kmsan_slab_alloc(s, p[i], init_flags);
- alloc_tagging_slab_alloc_hook(s, p[i], flags);
+ alloc_tagging_slab_alloc_hook(s, p[i], flags, ac->alloc_flags);
}
- return memcg_slab_post_alloc_hook(s, lru, flags, size, p);
+ return memcg_slab_post_alloc_hook(s, flags, size, p, ac);
}
/*
@@ -4918,6 +4926,12 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
{
const unsigned int alloc_flags = SLAB_ALLOC_DEFAULT;
void *object;
+ struct slab_alloc_context ac = {
+ .caller_addr = addr,
+ .orig_size = orig_size,
+ .alloc_flags = alloc_flags,
+ .lru = lru,
+ };
s = slab_pre_alloc_hook(s, gfpflags);
if (unlikely(!s))
@@ -4929,14 +4943,8 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
object = alloc_from_pcs(s, gfpflags, alloc_flags, node);
- if (unlikely(!object)) {
- struct slab_alloc_context ac = {
- .caller_addr = addr,
- .orig_size = orig_size,
- .alloc_flags = alloc_flags,
- };
+ if (!object)
object = __slab_alloc_node(s, gfpflags, node, &ac);
- }
maybe_wipe_obj_freeptr(s, object);
@@ -4945,7 +4953,7 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
* In case this fails due to memcg_slab_post_alloc_hook(),
* object is set to NULL
*/
- slab_post_alloc_hook(s, lru, gfpflags, 1, &object, orig_size);
+ slab_post_alloc_hook(s, gfpflags, 1, &object, &ac);
return object;
}
@@ -5240,6 +5248,10 @@ kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *s, gfp_t gfp,
struct slab_sheaf *sheaf)
{
void *ret = NULL;
+ struct slab_alloc_context ac = {
+ .orig_size = s->object_size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
if (sheaf->size == 0)
goto out;
@@ -5250,7 +5262,7 @@ kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *s, gfp_t gfp,
ret = sheaf->objects[--sheaf->size];
/* add __GFP_NOFAIL to force successful memcg charging */
- slab_post_alloc_hook(s, NULL, gfp | __GFP_NOFAIL, 1, &ret, s->object_size);
+ slab_post_alloc_hook(s, gfp | __GFP_NOFAIL, 1, &ret, &ac);
out:
trace_kmem_cache_alloc(_RET_IP_, ret, s, gfp, NUMA_NO_NODE);
@@ -5437,7 +5449,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
success:
maybe_wipe_obj_freeptr(s, ret);
- slab_post_alloc_hook(s, NULL, alloc_gfp, 1, &ret, orig_size);
+ slab_post_alloc_hook(s, alloc_gfp, 1, &ret, &ac);
ret = kasan_kmalloc(s, ret, orig_size, alloc_gfp);
return ret;
@@ -7303,6 +7315,10 @@ bool kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags,
{
unsigned int i = 0;
void *kfence_obj;
+ struct slab_alloc_context ac = {
+ .orig_size = s->object_size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
if (!size)
return false;
@@ -7353,7 +7369,7 @@ bool kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags,
out:
/* memcg and kmem_cache debug support and memory initialization */
- return likely(slab_post_alloc_hook(s, NULL, flags, size, p, s->object_size));
+ return likely(slab_post_alloc_hook(s, flags, size, p, &ac));
}
EXPORT_SYMBOL(kmem_cache_alloc_bulk_noprof);
--
2.54.0
* [PATCH v2 10/16] mm/slab: replace slab_alloc_node() parameters with slab_alloc_context
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
` (8 preceding siblings ...)
2026-06-10 15:40 ` [PATCH v2 09/16] mm/slab: pass alloc_flags through slab_post_alloc_hook() chain Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 11/16] mm/slab: allow kmem_cache_alloc_bulk() with any gfp flags Vlastimil Babka (SUSE)
` (5 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
The function takes all the parameters that exist as fields in
slab_alloc_context, except alloc_flags. Replace them with a single
pointer.
This moves slab_alloc_context initialization to a number of callers,
which is more verbose, but arguably also clearer than a long list of
parameters, and most callers do not use the 'lru' field.
This will also allow kmalloc_nolock() to call slab_alloc_node() and
reduce the special open-coding it currently has.
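The caller-side pattern can be sketched in plain C as follows. This is
an illustration only: the struct lists just the four fields visible in
this series (the real one may have more), and make_plain_ctx() and
sketch_alloc() are hypothetical stand-ins for a caller such as
kmem_cache_alloc_noprof() and for slab_alloc_node().

```c
#include <assert.h>
#include <stddef.h>

#define SLAB_ALLOC_DEFAULT	0x00u

struct list_lru;	/* opaque here; only pointers to it are used */

/* Only the fields visible in this series; the real struct may have more. */
struct slab_alloc_context {
	unsigned long caller_addr;
	unsigned long orig_size;
	unsigned int alloc_flags;
	struct list_lru *lru;
};

/*
 * Hypothetical caller-side setup. Fields not named in a designated
 * initializer are zero-initialized, so 'lru' is NULL without being
 * spelled out, which is why most callers can omit it.
 */
static struct slab_alloc_context make_plain_ctx(unsigned long caller,
						unsigned long size)
{
	struct slab_alloc_context ac = {
		.caller_addr = caller,
		.orig_size = size,
		.alloc_flags = SLAB_ALLOC_DEFAULT,
	};

	return ac;
}

/* Hypothetical callee: reads its parameters from the context pointer. */
static unsigned long sketch_alloc(const struct slab_alloc_context *ac)
{
	assert(ac != NULL);
	return ac->orig_size;
}
```

Only kmem_cache_alloc_lru_noprof() needs to set '.lru' explicitly in the
hunks below; every other caller relies on the implicit NULL.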
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 75 ++++++++++++++++++++++++++++++++++++++++++++-------------------
1 file changed, 53 insertions(+), 22 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index e634137b67fa..0b9974bfcb24 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4921,30 +4921,23 @@ unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, gfp_t gfp, size_t size,
*
* Otherwise we can simply pick the next object from the lockless free list.
*/
-static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list_lru *lru,
- gfp_t gfpflags, int node, unsigned long addr, size_t orig_size)
+static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s,
+ gfp_t gfpflags, int node, struct slab_alloc_context *ac)
{
- const unsigned int alloc_flags = SLAB_ALLOC_DEFAULT;
void *object;
- struct slab_alloc_context ac = {
- .caller_addr = addr,
- .orig_size = orig_size,
- .alloc_flags = alloc_flags,
- .lru = lru,
- };
s = slab_pre_alloc_hook(s, gfpflags);
if (unlikely(!s))
return NULL;
- object = kfence_alloc(s, orig_size, gfpflags);
+ object = kfence_alloc(s, ac->orig_size, gfpflags);
if (unlikely(object))
goto out;
- object = alloc_from_pcs(s, gfpflags, alloc_flags, node);
+ object = alloc_from_pcs(s, gfpflags, ac->alloc_flags, node);
if (!object)
- object = __slab_alloc_node(s, gfpflags, node, &ac);
+ object = __slab_alloc_node(s, gfpflags, node, ac);
maybe_wipe_obj_freeptr(s, object);
@@ -4953,15 +4946,21 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
* In case this fails due to memcg_slab_post_alloc_hook(),
* object is set to NULL
*/
- slab_post_alloc_hook(s, gfpflags, 1, &object, &ac);
+ slab_post_alloc_hook(s, gfpflags, 1, &object, ac);
return object;
}
void *kmem_cache_alloc_noprof(struct kmem_cache *s, gfp_t gfpflags)
{
- void *ret = slab_alloc_node(s, NULL, gfpflags, NUMA_NO_NODE, _RET_IP_,
- s->object_size);
+ void *ret;
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = s->object_size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
+
+ ret = slab_alloc_node(s, gfpflags, NUMA_NO_NODE, &ac);
trace_kmem_cache_alloc(_RET_IP_, ret, s, gfpflags, NUMA_NO_NODE);
@@ -4972,8 +4971,15 @@ EXPORT_SYMBOL(kmem_cache_alloc_noprof);
void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
gfp_t gfpflags)
{
- void *ret = slab_alloc_node(s, lru, gfpflags, NUMA_NO_NODE, _RET_IP_,
- s->object_size);
+ void *ret;
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = s->object_size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ .lru = lru,
+ };
+
+ ret = slab_alloc_node(s, gfpflags, NUMA_NO_NODE, &ac);
trace_kmem_cache_alloc(_RET_IP_, ret, s, gfpflags, NUMA_NO_NODE);
@@ -5005,7 +5011,14 @@ EXPORT_SYMBOL(kmem_cache_charge);
*/
void *kmem_cache_alloc_node_noprof(struct kmem_cache *s, gfp_t gfpflags, int node)
{
- void *ret = slab_alloc_node(s, NULL, gfpflags, node, _RET_IP_, s->object_size);
+ void *ret;
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = s->object_size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
+
+ ret = slab_alloc_node(s, gfpflags, node, &ac);
trace_kmem_cache_alloc(_RET_IP_, ret, s, gfpflags, node);
@@ -5335,6 +5348,11 @@ void *__do_kmalloc_node(size_t size, kmem_buckets *b, gfp_t flags, int node,
{
struct kmem_cache *s;
void *ret;
+ struct slab_alloc_context ac = {
+ .caller_addr = caller,
+ .orig_size = size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
ret = __kmalloc_large_node_noprof(size, flags, node);
@@ -5348,7 +5366,7 @@ void *__do_kmalloc_node(size_t size, kmem_buckets *b, gfp_t flags, int node,
s = kmalloc_slab(size, b, flags, token);
- ret = slab_alloc_node(s, NULL, flags, node, caller, size);
+ ret = slab_alloc_node(s, flags, node, &ac);
ret = kasan_kmalloc(s, ret, size, flags);
trace_kmalloc(caller, ret, size, s->size, flags, node);
return ret;
@@ -5467,8 +5485,14 @@ EXPORT_SYMBOL(__kmalloc_node_track_caller_noprof);
void *__kmalloc_cache_noprof(struct kmem_cache *s, gfp_t gfpflags, size_t size)
{
- void *ret = slab_alloc_node(s, NULL, gfpflags, NUMA_NO_NODE,
- _RET_IP_, size);
+ void *ret;
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
+
+ ret = slab_alloc_node(s, gfpflags, NUMA_NO_NODE, &ac);
trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, NUMA_NO_NODE);
@@ -5480,7 +5504,14 @@ EXPORT_SYMBOL(__kmalloc_cache_noprof);
void *__kmalloc_cache_node_noprof(struct kmem_cache *s, gfp_t gfpflags,
int node, size_t size)
{
- void *ret = slab_alloc_node(s, NULL, gfpflags, node, _RET_IP_, size);
+ void *ret;
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
+
+ ret = slab_alloc_node(s, gfpflags, node, &ac);
trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, node);
--
2.54.0
* [PATCH v2 11/16] mm/slab: allow kmem_cache_alloc_bulk() with any gfp flags
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
` (9 preceding siblings ...)
2026-06-10 15:40 ` [PATCH v2 10/16] mm/slab: replace slab_alloc_node() parameters with slab_alloc_context Vlastimil Babka (SUSE)
@ 2026-06-10 15:40 ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 12/16] mm/slab: pass slab_alloc_context to __do_kmalloc_node() Vlastimil Babka (SUSE)
` (4 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
The last user of gfpflags_allow_spinning() in slab is
alloc_from_pcs_bulk(), which is only called from
kmem_cache_alloc_bulk().
It turns out that gfpflags_allow_spinning() is not necessary there,
because kmem_cache_alloc_bulk() is only expected to be called from a
context that allows spinning, so simply replace it with 'true'.
With that, we can remove the "@flags must allow spinning" part of the
kernel doc, as there is no more connection to the gfp flags in the slab
implementation.
Also remove a comment in alloc_slab_obj_exts() because there should be
no more false positives possible due to gfp_allowed_mask during early
boot.
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 0b9974bfcb24..ef457e07db83 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2171,12 +2171,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
sz = obj_exts_alloc_size(s, slab, gfp);
- /*
- * Note that allow_spin may be false during early boot and its
- * restricted GFP_BOOT_MASK. Due to kmalloc_nolock() only supporting
- * architectures with cmpxchg16b, early obj_exts will be missing for
- * very early allocations on those.
- */
if (unlikely(!allow_spin))
vec = kmalloc_nolock(sz, __GFP_ZERO | __GFP_NO_OBJ_EXT,
slab_nid(slab));
@@ -4867,7 +4861,7 @@ unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, gfp_t gfp, size_t size,
}
full = barn_replace_empty_sheaf(barn, pcs->main,
- gfpflags_allow_spinning(gfp));
+ /* allow_spin = */ true);
if (full) {
stat(s, BARN_GET);
@@ -7333,8 +7327,7 @@ static bool __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
* Allocate @size objects from @s and places them into @p. @size must be larger
* than 0.
*
- * Interrupts must be enabled when calling this function and @flags must allow
- * spinning.
+ * Interrupts must be enabled when calling this function.
*
* Unlike alloc_pages_bulk(), this function does not check for already allocated
* objects in @p, and thus the caller does not need to zero it.
--
2.54.0
* [PATCH v2 12/16] mm/slab: pass slab_alloc_context to __do_kmalloc_node()
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
With alloc_flags usage in slab, we can replace __GFP_NO_OBJ_EXT with an
alloc flag that prevents kmalloc recursion. For that we need a version
of kmalloc() that takes alloc_flags, and we need to use it in the places
that perform these potentially recursive kmalloc allocations (of sheaves
or obj_ext arrays).
As a preparatory step, make __do_kmalloc_node() take a pointer to
slab_alloc_context. This replaces the 'caller' parameter and includes
alloc_flags which we'll make use of.
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
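Reviewer's sketch, not part of the patch: the refactoring described above
moves construction of the context from the inner helper to each entry
point. The userspace model below shows only that shape. All model_* names
are hypothetical stand-ins, and malloc() stands in for the slab allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; only the shape follows the patch. */
#define MODEL_ALLOC_DEFAULT 0x00u

struct model_alloc_context {
	unsigned long caller_addr;	/* who asked for the memory */
	size_t orig_size;		/* size the caller requested */
	unsigned int alloc_flags;	/* allocator-internal flags */
};

/* Records what the inner helper would hand to its tracepoint. */
unsigned long model_traced_caller;

/* Inner helper: no longer builds the context, only consumes it. */
void *model_do_kmalloc_node(size_t size, const struct model_alloc_context *ac)
{
	assert(ac != NULL);
	model_traced_caller = ac->caller_addr;
	return malloc(size);
}

/* Entry point: builds the context itself, then passes a pointer down. */
void *model_kmalloc_track_caller(size_t size, unsigned long caller)
{
	struct model_alloc_context ac = {
		.caller_addr = caller,
		.orig_size = size,
		.alloc_flags = MODEL_ALLOC_DEFAULT,
	};

	return model_do_kmalloc_node(size, &ac);
}
```

With the context owned by the entry point, a later entry point can pass
different alloc_flags without touching the inner helper's signature again.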
mm/slub.c | 47 ++++++++++++++++++++++++++++++++---------------
1 file changed, 32 insertions(+), 15 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index ef457e07db83..6845e15c148a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5338,19 +5338,14 @@ EXPORT_SYMBOL(__kmalloc_large_node_noprof);
static __always_inline
void *__do_kmalloc_node(size_t size, kmem_buckets *b, gfp_t flags, int node,
- unsigned long caller, kmalloc_token_t token)
+ kmalloc_token_t token, struct slab_alloc_context *ac)
{
struct kmem_cache *s;
void *ret;
- struct slab_alloc_context ac = {
- .caller_addr = caller,
- .orig_size = size,
- .alloc_flags = SLAB_ALLOC_DEFAULT,
- };
if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
ret = __kmalloc_large_node_noprof(size, flags, node);
- trace_kmalloc(caller, ret, size,
+ trace_kmalloc(ac->caller_addr, ret, size,
PAGE_SIZE << get_order(size), flags, node);
return ret;
}
@@ -5360,22 +5355,34 @@ void *__do_kmalloc_node(size_t size, kmem_buckets *b, gfp_t flags, int node,
s = kmalloc_slab(size, b, flags, token);
- ret = slab_alloc_node(s, flags, node, &ac);
+ ret = slab_alloc_node(s, flags, node, ac);
ret = kasan_kmalloc(s, ret, size, flags);
- trace_kmalloc(caller, ret, size, s->size, flags, node);
+ trace_kmalloc(ac->caller_addr, ret, size, s->size, flags, node);
return ret;
}
void *__kmalloc_node_noprof(DECL_KMALLOC_PARAMS(size, b, token), gfp_t flags, int node)
{
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
+
return __do_kmalloc_node(size, PASS_BUCKET_PARAM(b), flags, node,
- _RET_IP_, PASS_TOKEN_PARAM(token));
+ PASS_TOKEN_PARAM(token), &ac);
}
EXPORT_SYMBOL(__kmalloc_node_noprof);
void *__kmalloc_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t flags)
{
- return __do_kmalloc_node(size, NULL, flags, NUMA_NO_NODE, _RET_IP_,
- PASS_TOKEN_PARAM(token));
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
+
+ return __do_kmalloc_node(size, NULL, flags, NUMA_NO_NODE,
+ PASS_TOKEN_PARAM(token), &ac);
}
EXPORT_SYMBOL(__kmalloc_noprof);
@@ -5471,9 +5478,14 @@ EXPORT_SYMBOL_GPL(_kmalloc_nolock_noprof);
void *__kmalloc_node_track_caller_noprof(DECL_KMALLOC_PARAMS(size, b, token), gfp_t flags,
int node, unsigned long caller)
{
- return __do_kmalloc_node(size, PASS_BUCKET_PARAM(b), flags, node,
- caller, PASS_TOKEN_PARAM(token));
+ struct slab_alloc_context ac = {
+ .caller_addr = caller,
+ .orig_size = size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
+ return __do_kmalloc_node(size, PASS_BUCKET_PARAM(b), flags, node,
+ PASS_TOKEN_PARAM(token), &ac);
}
EXPORT_SYMBOL(__kmalloc_node_track_caller_noprof);
@@ -6874,6 +6886,11 @@ void *__kvmalloc_node_noprof(DECL_KMALLOC_PARAMS(size, b, token), unsigned long
{
bool allow_block;
void *ret;
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = size,
+ .alloc_flags = SLAB_ALLOC_DEFAULT,
+ };
/*
* It doesn't really make sense to fallback to vmalloc for sub page
@@ -6881,7 +6898,7 @@ void *__kvmalloc_node_noprof(DECL_KMALLOC_PARAMS(size, b, token), unsigned long
*/
ret = __do_kmalloc_node(size, PASS_BUCKET_PARAM(b),
kmalloc_gfp_adjust(flags, size),
- node, _RET_IP_, PASS_TOKEN_PARAM(token));
+ node, PASS_TOKEN_PARAM(token), &ac);
if (ret || size <= PAGE_SIZE)
return ret;
--
2.54.0
* [PATCH v2 13/16] mm/slab: allow __GFP_NOMEMALLOC and __GFP_NOWARN for kmalloc_nolock()
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
The two flags are added internally, so there's no point in warning if
they are passed by the caller as well; allow them. This will allow
simplifying obj_ext allocation under kmalloc_nolock().
Also it's not necessary to have the extra alloc_gfp variable for adding
the two flags. The original gfp_flags parameter is not used anywhere
except for the warning. So remove alloc_gfp and directly modify and use
gfp_flags everywhere.
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
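Reviewer's sketch, not part of the patch: the flag handling described
above is a mask check followed by an unconditional OR. The model below
uses made-up bit values (the MODEL_GFP_* constants are assumptions, not
the kernel's __GFP_* values) to show why accepting the two internal flags
from the caller is harmless.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; the real __GFP_* bits live in kernel headers. */
#define MODEL_GFP_ACCOUNT	0x01u
#define MODEL_GFP_ZERO		0x02u
#define MODEL_GFP_NO_OBJ_EXT	0x04u
#define MODEL_GFP_NOWARN	0x08u
#define MODEL_GFP_NOMEMALLOC	0x10u
#define MODEL_GFP_OTHER		0x20u	/* any flag not on the allowed list */

#define MODEL_NOLOCK_INTERNAL	(MODEL_GFP_NOWARN | MODEL_GFP_NOMEMALLOC)
#define MODEL_NOLOCK_ALLOWED	(MODEL_GFP_ACCOUNT | MODEL_GFP_ZERO | \
				 MODEL_GFP_NO_OBJ_EXT | MODEL_NOLOCK_INTERNAL)

/* True when the caller passed nothing outside the allowed set. */
bool model_nolock_flags_ok(unsigned int gfp_flags)
{
	return !(gfp_flags & ~MODEL_NOLOCK_ALLOWED);
}

/* Flags used for the allocation: the caller's plus the internal pair. */
unsigned int model_nolock_effective_flags(unsigned int gfp_flags)
{
	unsigned int effective = gfp_flags | MODEL_NOLOCK_INTERNAL;

	/* OR is idempotent, so passing the pair explicitly changes nothing. */
	assert((effective | MODEL_NOLOCK_INTERNAL) == effective);
	return effective;
}
```

Because the effective flags are the same whether or not the caller passes
the pair, a separate alloc_gfp copy carries no extra information.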
include/linux/slab.h | 3 ++-
mm/slub.c | 19 ++++++++++---------
2 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index ce1c867dc0ba..b955f3cbb732 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -1040,7 +1040,8 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
* kmalloc_nolock - Allocate an object of given size from any context.
* @size: size to allocate
* @gfp_flags: GFP flags. Only __GFP_ACCOUNT, __GFP_ZERO, __GFP_NO_OBJ_EXT
- * allowed.
+ * allowed. Also __GFP_NOWARN and __GFP_NOMEMALLOC are allowed but added
+ * internally thus not necessary.
* @node: node number of the target node.
*
* Return: pointer to the new object or NULL in case of error.
diff --git a/mm/slub.c b/mm/slub.c
index 6845e15c148a..847cad5203b2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5388,7 +5388,6 @@ EXPORT_SYMBOL(__kmalloc_noprof);
void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
{
- gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
size_t orig_size = size;
unsigned int alloc_flags = SLAB_ALLOC_TRYLOCK;
struct kmem_cache *s;
@@ -5396,7 +5395,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
void *ret;
VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO |
- __GFP_NO_OBJ_EXT));
+ __GFP_NO_OBJ_EXT | __GFP_NOWARN | __GFP_NOMEMALLOC));
+
+ gfp_flags |= __GFP_NOWARN | __GFP_NOMEMALLOC;
if (unlikely(!size))
return ZERO_SIZE_PTR;
@@ -5415,7 +5416,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
retry:
if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
return NULL;
- s = kmalloc_slab(size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
+ s = kmalloc_slab(size, NULL, gfp_flags, PASS_TOKEN_PARAM(token));
if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
/*
@@ -5429,7 +5430,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
*/
return NULL;
- ret = alloc_from_pcs(s, alloc_gfp, alloc_flags, node);
+ ret = alloc_from_pcs(s, gfp_flags, alloc_flags, node);
if (ret)
goto success;
@@ -5445,7 +5446,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
* kfence_alloc. Hence call __slab_alloc_node() (at most twice)
* and slab_post_alloc_hook() directly.
*/
- ret = __slab_alloc_node(s, alloc_gfp, node, &ac);
+ ret = __slab_alloc_node(s, gfp_flags, node, &ac);
/*
* It's possible we failed due to trylock as we preempted someone with
@@ -5458,8 +5459,8 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
size = s->object_size + 1;
/*
* Another alternative is to
- * if (memcg) alloc_gfp &= ~__GFP_ACCOUNT;
- * else if (!memcg) alloc_gfp |= __GFP_ACCOUNT;
+ * if (memcg) gfp_flags &= ~__GFP_ACCOUNT;
+ * else if (!memcg) gfp_flags |= __GFP_ACCOUNT;
* to retry from bucket of the same size.
*/
can_retry = false;
@@ -5468,9 +5469,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
success:
maybe_wipe_obj_freeptr(s, ret);
- slab_post_alloc_hook(s, alloc_gfp, 1, &ret, &ac);
+ slab_post_alloc_hook(s, gfp_flags, 1, &ret, &ac);
- ret = kasan_kmalloc(s, ret, orig_size, alloc_gfp);
+ ret = kasan_kmalloc(s, ret, orig_size, gfp_flags);
return ret;
}
EXPORT_SYMBOL_GPL(_kmalloc_nolock_noprof);
--
2.54.0
* [PATCH v2 14/16] mm/slab: introduce kmalloc_flags()
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
With alloc_flags usage in slab, we can replace __GFP_NO_OBJ_EXT with an
alloc flag that prevents kmalloc recursion. For that we need a version
of kmalloc() that takes alloc_flags, and we need to use it in the places
that perform these potentially recursive kmalloc allocations (of sheaves
or obj_ext arrays).
Add this function, named kmalloc_flags(). Right now it's only useful for
these nested allocations, so it doesn't need to optimize build-time
constant sizes like kmalloc() or kmalloc_buckets.
Since we need it to support both the normal context and the non-spinning
kmalloc_nolock() context through the SLAB_ALLOC_TRYLOCK flag, split out
most of the special _kmalloc_nolock_noprof() implementation into
__kmalloc_nolock_noprof(), which takes a slab_alloc_context, and make
_kmalloc_nolock_noprof() a simple tail-calling wrapper with the proper
context.
kmalloc_flags() can thus determine whether to call
__kmalloc_nolock_noprof() or __do_kmalloc_node(), based on the
given alloc_flags.
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
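Reviewer's sketch, not part of the patch: the dispatch described above can
be modeled in userspace as a single test on alloc_flags. The model_* names
and the enum are hypothetical; each path only reports which branch was
taken instead of allocating anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values mirror the patch, the names do not. */
#define MODEL_ALLOC_DEFAULT	0x00u
#define MODEL_ALLOC_TRYLOCK	0x01u

enum model_path { MODEL_PATH_NORMAL, MODEL_PATH_NOLOCK };

struct model_alloc_context {
	unsigned long caller_addr;
	size_t orig_size;
	unsigned int alloc_flags;
};

bool model_allow_spinning(unsigned int alloc_flags)
{
	return !(alloc_flags & MODEL_ALLOC_TRYLOCK);
}

/* Normal path: only valid when spinning is allowed. */
static enum model_path model_do_kmalloc_node(const struct model_alloc_context *ac)
{
	assert(model_allow_spinning(ac->alloc_flags));
	return MODEL_PATH_NORMAL;
}

/* Trylock path: only valid when spinning is not allowed. */
static enum model_path model_kmalloc_nolock(const struct model_alloc_context *ac)
{
	assert(!model_allow_spinning(ac->alloc_flags));
	return MODEL_PATH_NOLOCK;
}

/* One entry point builds the context and picks the path from alloc_flags. */
enum model_path model_kmalloc_flags(size_t size, unsigned int alloc_flags)
{
	struct model_alloc_context ac = {
		.caller_addr = 0,	/* the patch records _RET_IP_ here */
		.orig_size = size,
		.alloc_flags = alloc_flags,
	};

	if (model_allow_spinning(alloc_flags))
		return model_do_kmalloc_node(&ac);
	return model_kmalloc_nolock(&ac);
}
```

Callers that already carry alloc_flags can then forward them unchanged
and never need their own spinning check.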
mm/slab.h | 13 +++++++++++++
mm/slub.c | 56 +++++++++++++++++++++++++++++++++++++++++++-------------
2 files changed, 56 insertions(+), 13 deletions(-)
diff --git a/mm/slab.h b/mm/slab.h
index 4db6d8aa0ee3..45bfcfb35a9c 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -11,6 +11,7 @@
#include <linux/memcontrol.h>
#include <linux/kfence.h>
#include <linux/kasan.h>
+#include <linux/slab.h>
/*
* Internal slab definitions
@@ -26,6 +27,18 @@ static inline bool alloc_flags_allow_spinning(const unsigned int alloc_flags)
return !(alloc_flags & SLAB_ALLOC_TRYLOCK);
}
+void *__kmalloc_flags_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t flags,
+ unsigned int alloc_flags, int node)
+ __assume_kmalloc_alignment __alloc_size(1);
+
+static __always_inline __alloc_size(1) void *_kmalloc_flags_noprof(size_t size,
+ gfp_t flags, unsigned int alloc_flags, int node, kmalloc_token_t token)
+{
+ return __kmalloc_flags_noprof(PASS_TOKEN_PARAMS(size, token), flags, alloc_flags, node);
+}
+#define kmalloc_flags_noprof(...) _kmalloc_flags_noprof(__VA_ARGS__, __kmalloc_token(__VA_ARGS__))
+#define kmalloc_flags(...) alloc_hooks(kmalloc_flags_noprof(__VA_ARGS__))
+
#ifdef CONFIG_64BIT
# ifdef system_has_cmpxchg128
# define system_has_freelist_aba() system_has_cmpxchg128()
diff --git a/mm/slub.c b/mm/slub.c
index 847cad5203b2..cbb38bd01e46 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5386,14 +5386,14 @@ void *__kmalloc_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t flags)
}
EXPORT_SYMBOL(__kmalloc_noprof);
-void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
+static void *__kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags,
+ int node, struct slab_alloc_context *ac)
{
- size_t orig_size = size;
- unsigned int alloc_flags = SLAB_ALLOC_TRYLOCK;
struct kmem_cache *s;
bool can_retry = true;
void *ret;
+ VM_WARN_ON_ONCE(alloc_flags_allow_spinning(ac->alloc_flags));
VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO |
__GFP_NO_OBJ_EXT | __GFP_NOWARN | __GFP_NOMEMALLOC));
@@ -5430,23 +5430,17 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
*/
return NULL;
- ret = alloc_from_pcs(s, gfp_flags, alloc_flags, node);
+ ret = alloc_from_pcs(s, gfp_flags, ac->alloc_flags, node);
if (ret)
goto success;
- struct slab_alloc_context ac = {
- .caller_addr = _RET_IP_,
- .orig_size = orig_size,
- .alloc_flags = alloc_flags,
- };
-
/*
* Do not call slab_alloc_node(), since trylock mode isn't
* compatible with slab_pre_alloc_hook/should_failslab and
* kfence_alloc. Hence call __slab_alloc_node() (at most twice)
* and slab_post_alloc_hook() directly.
*/
- ret = __slab_alloc_node(s, gfp_flags, node, &ac);
+ ret = __slab_alloc_node(s, gfp_flags, node, ac);
/*
* It's possible we failed due to trylock as we preempted someone with
@@ -5469,11 +5463,23 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
success:
maybe_wipe_obj_freeptr(s, ret);
- slab_post_alloc_hook(s, gfp_flags, 1, &ret, &ac);
+ slab_post_alloc_hook(s, gfp_flags, 1, &ret, ac);
- ret = kasan_kmalloc(s, ret, orig_size, gfp_flags);
+ ret = kasan_kmalloc(s, ret, ac->orig_size, gfp_flags);
return ret;
}
+
+void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
+{
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = size,
+ .alloc_flags = SLAB_ALLOC_TRYLOCK,
+ };
+
+ return __kmalloc_nolock_noprof(PASS_TOKEN_PARAMS(size, token),
+ gfp_flags, node, &ac);
+}
EXPORT_SYMBOL_GPL(_kmalloc_nolock_noprof);
void *__kmalloc_node_track_caller_noprof(DECL_KMALLOC_PARAMS(size, b, token), gfp_t flags,
@@ -5527,6 +5533,30 @@ void *__kmalloc_cache_node_noprof(struct kmem_cache *s, gfp_t gfpflags,
}
EXPORT_SYMBOL(__kmalloc_cache_node_noprof);
+/*
+ * The only version of kmalloc_node() that takes alloc_flags and thus can
+ * determine on its own whether to handle the allocation via kmalloc_nolock() or
+ * normally
+ */
+void *__kmalloc_flags_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t flags,
+ unsigned int alloc_flags, int node)
+{
+ struct slab_alloc_context ac = {
+ .caller_addr = _RET_IP_,
+ .orig_size = size,
+ .alloc_flags = alloc_flags,
+ };
+
+ if (alloc_flags_allow_spinning(alloc_flags)) {
+ return __do_kmalloc_node(size, NULL, flags, node,
+ PASS_TOKEN_PARAM(token), &ac);
+ } else {
+ return __kmalloc_nolock_noprof(PASS_TOKEN_PARAMS(size, token),
+ flags, node, &ac);
+ }
+}
+
+
static noinline void free_to_partial_list(
struct kmem_cache *s, struct slab *slab,
void *head, void *tail, int bulk_cnt,
--
2.54.0
* [PATCH v2 15/16] mm/slab: remove __GFP_NO_OBJ_EXT usage from alloc_slab_obj_exts()
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
__GFP_NO_OBJ_EXT has limited scope within the slab allocator itself and
gfp flags are a scarce resource, unlike slab's alloc_flags.
Introduce SLAB_ALLOC_NO_RECURSE alloc flag that has the same intent as
__GFP_NO_OBJ_EXT but a more generic name, meaning that a kmalloc()
family function should not recurse into another kmalloc*() for the
purposes of allocating auxiliary structures (obj_ext arrays or sheaves).
First, replace the __GFP_NO_OBJ_EXT for allocating obj_ext arrays in
alloc_slab_obj_exts(). Make use of the newly added kmalloc_flags()
function, where we can pass alloc_flags with SLAB_ALLOC_NO_RECURSE
added. This will also pass through SLAB_ALLOC_TRYLOCK so we don't need
to special case kmalloc_nolock() anymore.
Note that until now the kmalloc_nolock() call here ignored the incoming
gfp flags and hardcoded __GFP_ZERO | __GFP_NO_OBJ_EXT. But it's correct
to pass on the incoming gfp flags (only augmented with __GFP_ZERO),
because if alloc_flags contains SLAB_ALLOC_TRYLOCK, the incoming gfp
flags also have to be compatible with it.
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
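Reviewer's sketch, not part of the patch: the recursion guard described
above can be modeled as an allocator whose post-allocation step allocates
an auxiliary array through the same allocator. The model_* names and the
counters are hypothetical and exist only to make the control flow
observable.

```c
#include <assert.h>

/* Hypothetical stand-ins; the values mirror the patch, the names do not. */
#define MODEL_ALLOC_DEFAULT	0x00u
#define MODEL_ALLOC_TRYLOCK	0x01u
#define MODEL_ALLOC_NO_RECURSE	0x04u

unsigned int model_nested_allocs;	/* auxiliary allocations performed */
unsigned int model_last_nested_flags;	/* flags seen by the last one */

void model_alloc(unsigned int alloc_flags);

/* Allocates the auxiliary array via the same allocator, guard flag added. */
static void model_alloc_aux(unsigned int alloc_flags)
{
	/* Only reached from an allocation that is allowed to recurse. */
	assert(!(alloc_flags & MODEL_ALLOC_NO_RECURSE));

	alloc_flags |= MODEL_ALLOC_NO_RECURSE;
	model_nested_allocs++;
	model_last_nested_flags = alloc_flags;
	model_alloc(alloc_flags);	/* nested call, stops at the guard */
}

/* Main allocation: wants an auxiliary array unless told not to recurse. */
void model_alloc(unsigned int alloc_flags)
{
	if (alloc_flags & MODEL_ALLOC_NO_RECURSE)
		return;
	model_alloc_aux(alloc_flags);
}
```

In this model a trylock allocation triggers exactly one nested allocation,
and that nested allocation still carries the trylock bit, which is the
pass-through behaviour the commit message relies on.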
mm/slab.h | 1 +
mm/slub.c | 13 +++++--------
2 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/mm/slab.h b/mm/slab.h
index 45bfcfb35a9c..509f330654b8 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -21,6 +21,7 @@
#define SLAB_ALLOC_DEFAULT 0x00 /* no flags */
#define SLAB_ALLOC_TRYLOCK 0x01 /* a kmalloc_nolock() allocation */
#define SLAB_ALLOC_NEW_SLAB 0x02 /* a flag for alloc_slab_obj_exts() */
+#define SLAB_ALLOC_NO_RECURSE 0x04 /* prevent kmalloc() recursion */
static inline bool alloc_flags_allow_spinning(const unsigned int alloc_flags)
{
diff --git a/mm/slub.c b/mm/slub.c
index cbb38bd01e46..7dfbd0251aa2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2167,15 +2167,12 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
gfp &= ~OBJCGS_CLEAR_MASK;
/* Prevent recursive extension vector allocation */
- gfp |= __GFP_NO_OBJ_EXT;
+ alloc_flags |= SLAB_ALLOC_NO_RECURSE;
sz = obj_exts_alloc_size(s, slab, gfp);
- if (unlikely(!allow_spin))
- vec = kmalloc_nolock(sz, __GFP_ZERO | __GFP_NO_OBJ_EXT,
- slab_nid(slab));
- else
- vec = kmalloc_node(sz, gfp | __GFP_ZERO, slab_nid(slab));
+ /* This will use kmalloc_nolock() if alloc_flags say so */
+ vec = kmalloc_flags(sz, gfp | __GFP_ZERO, alloc_flags, slab_nid(slab));
if (!vec) {
/*
@@ -2251,7 +2248,7 @@ static inline void free_slab_obj_exts(struct slab *slab, bool allow_spin)
}
/*
- * obj_exts was created with __GFP_NO_OBJ_EXT flag, therefore its
+ * obj_exts was created with SLAB_ALLOC_NO_RECURSE flag, therefore its
* corresponding extension will be NULL. alloc_tag_sub() will throw a
* warning if slab has extensions but the extension of an object is
* NULL, therefore replace NULL with CODETAG_EMPTY to indicate that
@@ -2374,7 +2371,7 @@ __alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags,
if (s->flags & (SLAB_NO_OBJ_EXT | SLAB_NOLEAKTRACE))
return;
- if (flags & __GFP_NO_OBJ_EXT)
+ if (alloc_flags & SLAB_ALLOC_NO_RECURSE)
return;
slab = virt_to_slab(object);
--
2.54.0
* [PATCH v2 16/16] mm/slab: replace __GFP_NO_OBJ_EXT with SLAB_ALLOC_NO_RECURSE for sheaves
From: Vlastimil Babka (SUSE) @ 2026-06-10 15:40 UTC (permalink / raw)
To: Harry Yoo
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Shakeel Butt, Alexander Potapenko,
Marco Elver, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel,
cgroups, Vlastimil Babka (SUSE)
Finish the switch away from __GFP_NO_OBJ_EXT by replacing it with
SLAB_ALLOC_NO_RECURSE when allocating empty sheaves. Pass alloc_flags to
[__]alloc_empty_sheaf(). Callers that can't be part of a recursive
kmalloc() chain simply pass SLAB_ALLOC_DEFAULT. Use kmalloc_flags()
instead of kzalloc() for allocating the sheaf.
This leaves __GFP_NO_OBJ_EXT with no users in slab, so stop allowing the
flag in kmalloc_nolock().
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
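Reviewer's sketch, not part of the patch: the sheaf side of the switch has
two parts, which the userspace model below separates. One helper derives
the flags for the sheaf's own allocation, the other refuses to allocate a
sheaf at all from inside a guarded allocation. The model_* names are
hypothetical, and calloc() stands in for kmalloc_flags() with __GFP_ZERO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the values mirror the patch, the names do not. */
#define MODEL_ALLOC_DEFAULT	0x00u
#define MODEL_ALLOC_NO_RECURSE	0x04u

struct model_sheaf {
	unsigned int capacity;
	void *objects[];	/* flexible array, sized at allocation time */
};

/* Flags for the sheaf's own allocation: kmalloc caches must not recurse. */
unsigned int model_sheaf_alloc_flags(bool is_kmalloc_cache,
				     unsigned int alloc_flags)
{
	if (is_kmalloc_cache)
		alloc_flags |= MODEL_ALLOC_NO_RECURSE;
	return alloc_flags;
}

/* Returns a zeroed sheaf, or NULL when called from a guarded allocation. */
struct model_sheaf *model_alloc_empty_sheaf(unsigned int alloc_flags,
					    unsigned int capacity)
{
	struct model_sheaf *sheaf;

	assert(capacity > 0);	/* model assumption: no zero-slot sheaves */

	if (alloc_flags & MODEL_ALLOC_NO_RECURSE)
		return NULL;

	sheaf = calloc(1, sizeof(*sheaf) + capacity * sizeof(sheaf->objects[0]));
	if (!sheaf)
		return NULL;

	sheaf->capacity = capacity;
	return sheaf;
}
```

Callers outside any recursive chain pass the default flags and always
reach the allocation, matching the SLAB_ALLOC_DEFAULT call sites below.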
include/linux/slab.h | 6 +++---
mm/slub.c | 31 ++++++++++++++++---------------
2 files changed, 19 insertions(+), 18 deletions(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index b955f3cbb732..43c3d9b51107 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -1039,9 +1039,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
/**
* kmalloc_nolock - Allocate an object of given size from any context.
* @size: size to allocate
- * @gfp_flags: GFP flags. Only __GFP_ACCOUNT, __GFP_ZERO, __GFP_NO_OBJ_EXT
- * allowed. Also __GFP_NOWARN and __GFP_NOMEMALLOC are allowed but added
- * internally thus not necessary.
+ * @gfp_flags: GFP flags. Only __GFP_ACCOUNT and __GFP_ZERO allowed. Also
+ * __GFP_NOWARN and __GFP_NOMEMALLOC are allowed but added internally thus not
+ * necessary.
* @node: node number of the target node.
*
* Return: pointer to the new object or NULL in case of error.
diff --git a/mm/slub.c b/mm/slub.c
index 7dfbd0251aa2..5d7ea72ebebd 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2756,7 +2756,7 @@ static inline void *setup_object(struct kmem_cache *s, void *object)
}
static struct slab_sheaf *__alloc_empty_sheaf(struct kmem_cache *s, gfp_t gfp,
- unsigned int capacity)
+ unsigned int alloc_flags, unsigned int capacity)
{
struct slab_sheaf *sheaf;
size_t sheaf_size;
@@ -2767,10 +2767,10 @@ static struct slab_sheaf *__alloc_empty_sheaf(struct kmem_cache *s, gfp_t gfp,
* bucket)
*/
if (s->flags & SLAB_KMALLOC)
- gfp |= __GFP_NO_OBJ_EXT;
+ alloc_flags |= SLAB_ALLOC_NO_RECURSE;
sheaf_size = struct_size(sheaf, objects, capacity);
- sheaf = kzalloc(sheaf_size, gfp);
+ sheaf = kmalloc_flags(sheaf_size, gfp | __GFP_ZERO, alloc_flags, NUMA_NO_NODE);
if (unlikely(!sheaf))
return NULL;
@@ -2783,20 +2783,20 @@ static struct slab_sheaf *__alloc_empty_sheaf(struct kmem_cache *s, gfp_t gfp,
}
static inline struct slab_sheaf *alloc_empty_sheaf(struct kmem_cache *s,
- gfp_t gfp)
+ gfp_t gfp, unsigned int alloc_flags)
{
- if (gfp & __GFP_NO_OBJ_EXT)
+ if (alloc_flags & SLAB_ALLOC_NO_RECURSE)
return NULL;
gfp &= ~OBJCGS_CLEAR_MASK;
- return __alloc_empty_sheaf(s, gfp, s->sheaf_capacity);
+ return __alloc_empty_sheaf(s, gfp, alloc_flags, s->sheaf_capacity);
}
static void free_empty_sheaf(struct kmem_cache *s, struct slab_sheaf *sheaf)
{
/*
- * If the sheaf was created with __GFP_NO_OBJ_EXT flag then its
+ * If the sheaf was created with SLAB_ALLOC_NO_RECURSE flag then its
* corresponding extension is NULL and alloc_tag_sub() will throw a
* warning, therefore replace NULL with CODETAG_EMPTY to indicate
* that the extension for this sheaf is expected to be NULL.
@@ -4689,7 +4689,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
return NULL;
if (!empty) {
- empty = alloc_empty_sheaf(s, gfp);
+ empty = alloc_empty_sheaf(s, gfp, alloc_flags);
if (!empty)
return NULL;
}
@@ -5063,7 +5063,7 @@ kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int size)
if (unlikely(size > s->sheaf_capacity)) {
- sheaf = __alloc_empty_sheaf(s, gfp, size);
+ sheaf = __alloc_empty_sheaf(s, gfp, SLAB_ALLOC_DEFAULT, size);
if (!sheaf)
return NULL;
@@ -5108,7 +5108,7 @@ kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int size)
if (!sheaf)
- sheaf = alloc_empty_sheaf(s, gfp);
+ sheaf = alloc_empty_sheaf(s, gfp, SLAB_ALLOC_DEFAULT);
if (sheaf) {
sheaf->capacity = s->sheaf_capacity;
@@ -5392,7 +5392,7 @@ static void *__kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_f
VM_WARN_ON_ONCE(alloc_flags_allow_spinning(ac->alloc_flags));
VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO |
- __GFP_NO_OBJ_EXT | __GFP_NOWARN | __GFP_NOMEMALLOC));
+ __GFP_NOWARN | __GFP_NOMEMALLOC));
gfp_flags |= __GFP_NOWARN | __GFP_NOMEMALLOC;
@@ -5907,7 +5907,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
if (!allow_spin)
return NULL;
- empty = alloc_empty_sheaf(s, GFP_NOWAIT);
+ empty = alloc_empty_sheaf(s, GFP_NOWAIT, SLAB_ALLOC_DEFAULT);
if (empty)
goto got_empty;
@@ -6091,7 +6091,7 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
local_unlock(&s->cpu_sheaves->lock);
- empty = alloc_empty_sheaf(s, GFP_NOWAIT);
+ empty = alloc_empty_sheaf(s, GFP_NOWAIT, SLAB_ALLOC_DEFAULT);
if (!empty)
goto fail;
@@ -7636,7 +7636,7 @@ static int init_percpu_sheaves(struct kmem_cache *s)
if (!s->sheaf_capacity)
pcs->main = &bootstrap_sheaf;
else
- pcs->main = alloc_empty_sheaf(s, GFP_KERNEL);
+ pcs->main = alloc_empty_sheaf(s, GFP_KERNEL, SLAB_ALLOC_DEFAULT);
if (!pcs->main)
return -ENOMEM;
@@ -8502,7 +8502,8 @@ static void __init bootstrap_cache_sheaves(struct kmem_cache *s)
pcs = per_cpu_ptr(s->cpu_sheaves, cpu);
- pcs->main = __alloc_empty_sheaf(s, GFP_KERNEL, capacity);
+ pcs->main = __alloc_empty_sheaf(s, GFP_KERNEL,
+ SLAB_ALLOC_DEFAULT, capacity);
if (!pcs->main) {
failed = true;
--
2.54.0