From: "Harry Yoo (Oracle)" <harry@kernel.org>
To: Vlastimil Babka <vbabka@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Hao Li <hao.li@linux.dev>, Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Suren Baghdasaryan <surenb@google.com>,
"Liam R. Howlett" <liam@infradead.org>
Subject: [PATCH RFC 5/8] mm/slab: rework cache_has_sheaves() to check immutable properties only
Date: Sat, 16 May 2026 01:24:29 +0900
Message-ID: <20260516-sheaves-tuning-v1-5-221aa3e1d829@kernel.org>
In-Reply-To: <20260516-sheaves-tuning-v1-0-221aa3e1d829@kernel.org>
Currently the sheaf capacity is determined when a cache is created and
never changes, with normal kmalloc caches as the only exception.
Checking whether s->sheaf_capacity is non-zero is therefore
sufficient for cache_has_sheaves() to work correctly.
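For reference, today's helper amounts to the following check (a
paraphrase of the current code, not an exact quote):

  static inline bool cache_has_sheaves(struct kmem_cache *s)
  {
          return s->sheaf_capacity != 0;
  }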
However, once s->sheaf_capacity becomes mutable at runtime, both the
name and the implementation become confusing and racy: a cache that
currently has sheaves may have them disabled at runtime, or vice versa.
Except for normal kmalloc caches, what callers of cache_has_sheaves()
actually want to know depends only on properties that do not change:
1. Whether the cache has certain flags (SLAB_NO_OBJ_EXT,
   SLAB_NOLEAKTRACE, SLAB_DEBUG_FLAGS)
2. Whether a certain build option is enabled (CONFIG_SLUB_TINY)
Since these never change at runtime, check them directly instead of
going through s->sheaf_capacity. To avoid confusion, rename
cache_has_sheaves() to cache_supports_sheaves().
Normal kmalloc caches need special handling. They don't have sheaves
initially and only get them later via bootstrap_kmalloc_sheaves().
As a result, cache_supports_sheaves() can return true while a cache's
percpu sheaves still point at the shared bootstrap_sheaf.
This may look like a quirk of normal kmalloc caches only, but the same
window exists once sheaf capacity can be changed at runtime.
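To make that window concrete, here is a (hypothetical) per-CPU snapshot
of a normal kmalloc cache after creation but before
bootstrap_kmalloc_sheaves() has run:

  cache_supports_sheaves(s)       /* true: flags and config permit sheaves */
  pcs->main == &bootstrap_sheaf   /* true: real sheaves not installed yet */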
The existing callers of cache_has_sheaves() fall into two categories.
The first category performs operations on the whole cache:
kvfree_rcu barrier, cache destruction, sheaf flushing, and CPU/memory
hot(un)plug. These must not skip a cache that supports sheaves,
regardless of whether it currently has them. If such an operation
needs to access the percpu sheaves themselves, use the new
pcs_has_sheaves() helper to skip CPUs whose pcs->main points to the
bootstrap_sheaf.
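A minimal sketch of that per-CPU skip, modeled on the pcs_destroy()
hunk below (the rest of the loop body is elided):

  for_each_possible_cpu(cpu) {
          struct slub_percpu_sheaves *pcs;

          pcs = per_cpu_ptr(s->cpu_sheaves, cpu);
          /* pcs->main still points at the shared bootstrap_sheaf */
          if (!pcs_has_sheaves_unlocked(pcs))
                  continue;
          /* ... operate on this CPU's real sheaves ... */
  }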
The second category allocates from or frees to percpu sheaves directly
(in the slowpath). These should confirm pcs_has_sheaves() returns true
before proceeding.
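A minimal sketch of that slowpath check, modeled on the
free_to_pcs_bulk() hunk below:

  local_lock(&s->cpu_sheaves->lock);
  pcs = this_cpu_ptr(s->cpu_sheaves);
  if (unlikely(!pcs_has_sheaves(pcs))) {
          local_unlock(&s->cpu_sheaves->lock);
          goto fallback;  /* take the non-sheaf path */
  }
  /* ... allocate from or free to pcs->main ... */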
In addition, init_kmem_cache_nodes() skips barn allocation for normal
kmalloc caches. Their barns are set up later by
bootstrap_kmalloc_sheaves().
Change calculate_sheaf_capacity() to call cache_supports_sheaves()
directly instead of open-coding the same conditions.
Signed-off-by: Harry Yoo (Oracle) <harry@kernel.org>
---
mm/slab.h | 36 ++++++++++++++++++++++++
mm/slab_common.c | 2 +-
mm/slub.c | 85 ++++++++++++++++++++++++++++++++------------------------
3 files changed, 86 insertions(+), 37 deletions(-)
diff --git a/mm/slab.h b/mm/slab.h
index dfbe73011cb8..907a8207809c 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -481,6 +481,42 @@ static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t fla
return false;
}
+static inline bool kmem_cache_debug(struct kmem_cache *s)
+{
+ return kmem_cache_debug_flags(s, SLAB_DEBUG_FLAGS);
+}
+
+/*
+ * Every cache has !NULL s->cpu_sheaves but they may point to the
+ * bootstrap_sheaf temporarily during init, or permanently for the boot caches
+ * and caches with debugging enabled, or all caches with CONFIG_SLUB_TINY. This
+ * helper distinguishes whether the cache supports real non-bootstrap sheaves.
+ *
+ * Return false when the cache does not support sheaves.
+ *
+ * When it returns true, the cache may or may not have sheaves.
+ * Callers that access the percpu sheaves must additionally check
+ * pcs_has_sheaves() to verify that real sheaves are installed.
+ */
+static inline bool cache_supports_sheaves(struct kmem_cache *s)
+{
+ if (IS_ENABLED(CONFIG_SLUB_TINY))
+ return false;
+
+ if (kmem_cache_debug(s))
+ return false;
+ /*
+ * Bootstrap caches can't have sheaves for now (SLAB_NO_OBJ_EXT).
+ * SLAB_NOLEAKTRACE caches (e.g., kmemleak's object_cache) must not
+ * have sheaves to avoid recursion when sheaf allocation triggers
+ * kmemleak tracking.
+ */
+ if (s->flags & (SLAB_NO_OBJ_EXT | SLAB_NOLEAKTRACE))
+ return false;
+
+ return true;
+}
+
#if IS_ENABLED(CONFIG_SLUB_DEBUG) && IS_ENABLED(CONFIG_KUNIT)
bool slab_in_kunit_test(void);
#else
diff --git a/mm/slab_common.c b/mm/slab_common.c
index d5a70a831a2a..3092c1c3f284 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -2109,7 +2109,7 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
*/
void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
{
- if (cache_has_sheaves(s)) {
+ if (cache_supports_sheaves(s)) {
flush_rcu_sheaves_on_cache(s);
rcu_barrier();
}
diff --git a/mm/slub.c b/mm/slub.c
index fb98d0da5c78..c746c9b48728 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -238,11 +238,6 @@ struct slab_obj_iter {
#endif
};
-static inline bool kmem_cache_debug(struct kmem_cache *s)
-{
- return kmem_cache_debug_flags(s, SLAB_DEBUG_FLAGS);
-}
-
void *fixup_red_left(struct kmem_cache *s, void *p)
{
if (kmem_cache_debug_flags(s, SLAB_RED_ZONE))
@@ -432,6 +427,23 @@ struct slub_percpu_sheaves {
struct slab_sheaf *rcu_free; /* for batching kfree_rcu() */
};
+static struct slab_sheaf bootstrap_sheaf = {};
+
+static inline bool pcs_has_sheaves_unlocked(struct slub_percpu_sheaves *pcs)
+{
+ /* Check CONFIG_SLUB_TINY explicitly so the compiler can eliminate this code */
+ if (IS_ENABLED(CONFIG_SLUB_TINY))
+ return false;
+
+ return unlikely(pcs->main != &bootstrap_sheaf);
+}
+
+static inline bool pcs_has_sheaves(struct slub_percpu_sheaves *pcs)
+{
+ lockdep_assert_held(&pcs->lock);
+ return pcs_has_sheaves_unlocked(pcs);
+}
+
/*
* The slab lists for all objects.
*/
@@ -3045,8 +3057,7 @@ static void pcs_destroy(struct kmem_cache *s)
if (!s->cpu_sheaves)
return;
- /* pcs->main can only point to the bootstrap sheaf, nothing to free */
- if (!cache_has_sheaves(s))
+ if (!cache_supports_sheaves(s))
goto free_pcs;
for_each_possible_cpu(cpu) {
@@ -3058,6 +3069,9 @@ static void pcs_destroy(struct kmem_cache *s)
if (!pcs->main)
continue;
+ if (!pcs_has_sheaves_unlocked(pcs))
+ continue;
+
/*
* We have already passed __kmem_cache_shutdown() so everything
* was flushed and there should be no objects allocated from
@@ -3949,7 +3963,7 @@ static bool has_pcs_used(int cpu, struct kmem_cache *s)
{
struct slub_percpu_sheaves *pcs;
- if (!cache_has_sheaves(s))
+ if (!cache_supports_sheaves(s))
return false;
pcs = per_cpu_ptr(s->cpu_sheaves, cpu);
@@ -3971,7 +3985,7 @@ static void flush_cpu_sheaves(struct work_struct *w)
s = sfw->s;
- if (cache_has_sheaves(s))
+ if (cache_supports_sheaves(s))
pcs_flush_all(s);
}
@@ -4074,7 +4088,7 @@ void flush_all_rcu_sheaves(void)
mutex_lock(&slab_mutex);
list_for_each_entry(s, &slab_caches, list) {
- if (!cache_has_sheaves(s))
+ if (!cache_supports_sheaves(s))
continue;
flush_rcu_sheaves_on_cache(s);
}
@@ -4109,7 +4123,7 @@ static int slub_cpu_setup(unsigned int cpu)
/*
* barn might already exist if a previous callback failed midway
*/
- if (!cache_has_sheaves(s) || get_barn_node(s, nid))
+ if (!cache_supports_sheaves(s) || get_barn_node(s, nid))
continue;
barn = kmalloc_node(sizeof(*barn), GFP_KERNEL, nid);
@@ -4140,7 +4154,7 @@ static int slub_cpu_dead(unsigned int cpu)
mutex_lock(&slab_mutex);
list_for_each_entry(s, &slab_caches, list) {
- if (cache_has_sheaves(s))
+ if (cache_supports_sheaves(s))
__pcs_flush_all_cpu(s, cpu);
}
mutex_unlock(&slab_mutex);
@@ -4612,8 +4626,8 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
- /* Bootstrap or debug cache, back off */
- if (unlikely(!cache_has_sheaves(s))) {
+ /* Sheaves are not supported or disabled for this cache */
+ if (unlikely(!pcs_has_sheaves(pcs))) {
local_unlock(&s->cpu_sheaves->lock);
return NULL;
}
@@ -4809,7 +4823,7 @@ unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, gfp_t gfp, size_t size,
struct slab_sheaf *full;
struct node_barn *barn;
- if (unlikely(!cache_has_sheaves(s))) {
+ if (unlikely(!pcs_has_sheaves(pcs))) {
local_unlock(&s->cpu_sheaves->lock);
return allocated;
}
@@ -5727,8 +5741,8 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
restart:
lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
- /* Bootstrap or debug cache, back off */
- if (unlikely(!cache_has_sheaves(s))) {
+ /* Sheaves are not supported or disabled for this cache */
+ if (unlikely(!pcs_has_sheaves(pcs))) {
local_unlock(&s->cpu_sheaves->lock);
return NULL;
}
@@ -5959,8 +5973,8 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
struct slab_sheaf *empty;
struct node_barn *barn;
- /* Bootstrap or debug cache, fall back */
- if (unlikely(!cache_has_sheaves(s))) {
+ /* Sheaves are not supported or disabled for this cache */
+ if (unlikely(!pcs_has_sheaves(pcs))) {
local_unlock(&s->cpu_sheaves->lock);
goto fail;
}
@@ -6138,6 +6152,11 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
pcs = this_cpu_ptr(s->cpu_sheaves);
+ if (unlikely(!pcs_has_sheaves(pcs))) {
+ local_unlock(&s->cpu_sheaves->lock);
+ goto fallback;
+ }
+
if (likely(pcs->main->size < pcs->main->capacity))
goto do_free;
@@ -7131,7 +7150,7 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
* freeing to sheaves is so incompatible with the detached freelist so
* once we go that way, we have to do everything differently
*/
- if (s && cache_has_sheaves(s)) {
+ if (s && cache_supports_sheaves(s)) {
free_to_pcs_bulk(s, size, p);
return;
}
@@ -7600,7 +7619,6 @@ static inline int alloc_kmem_cache_stats(struct kmem_cache *s)
static int init_percpu_sheaves(struct kmem_cache *s)
{
- static struct slab_sheaf bootstrap_sheaf = {};
int cpu;
for_each_possible_cpu(cpu) {
@@ -7614,7 +7632,7 @@ static int init_percpu_sheaves(struct kmem_cache *s)
* Bootstrap sheaf has zero size so fast-path allocation fails.
* It has also size == sheaf->capacity, so fast-path free
* fails. In the slow paths we recognize the situation by
- * checking s->sheaf_capacity. This allows fast paths to assume
+ * pcs_has_sheaves(). This allows fast paths to assume
* s->cpu_sheaves and pcs->main always exists and are valid.
* It's also safe to share the single static bootstrap_sheaf
* with zero-sized objects array as it's never modified.
@@ -7631,6 +7649,7 @@ static int init_percpu_sheaves(struct kmem_cache *s)
if (!pcs->main)
return -ENOMEM;
+
}
return 0;
@@ -7740,7 +7759,11 @@ static int init_kmem_cache_nodes(struct kmem_cache *s)
s->per_node[node].node = n;
}
- if (slab_state == DOWN || !cache_has_sheaves(s))
+ if (slab_state == DOWN || !cache_supports_sheaves(s))
+ return 1;
+
+ /* Enable sheaves later to avoid the chicken and egg problem */
+ if (is_kmalloc_normal(s))
return 1;
for_each_node_mask(node, slab_barn_nodes) {
@@ -7765,17 +7788,7 @@ static unsigned short calculate_sheaf_capacity(struct kmem_cache *s,
unsigned short capacity;
size_t size;
-
- if (IS_ENABLED(CONFIG_SLUB_TINY) || s->flags & SLAB_DEBUG_FLAGS)
- return 0;
-
- /*
- * Bootstrap caches can't have sheaves for now (SLAB_NO_OBJ_EXT).
- * SLAB_NOLEAKTRACE caches (e.g., kmemleak's object_cache) must not
- * have sheaves to avoid recursion when sheaf allocation triggers
- * kmemleak tracking.
- */
- if (s->flags & (SLAB_NO_OBJ_EXT | SLAB_NOLEAKTRACE))
+ if (!cache_supports_sheaves(s))
return 0;
/*
@@ -8040,7 +8053,7 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
flush_all_cpus_locked(s);
/* we might have rcu sheaves in flight */
- if (cache_has_sheaves(s))
+ if (cache_supports_sheaves(s))
rcu_barrier();
for_each_node(node) {
@@ -8361,7 +8374,7 @@ static int slab_mem_going_online_callback(int nid)
if (get_node(s, nid))
continue;
- if (cache_has_sheaves(s) && !get_barn_node(s, nid)) {
+ if (cache_supports_sheaves(s) && !get_barn_node(s, nid)) {
barn = kmalloc_node(sizeof(*barn), GFP_KERNEL, nid);
--
2.43.0