From: "Harry Yoo (Oracle)" <harry@kernel.org>
To: Vlastimil Babka <vbabka@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Hao Li <hao.li@linux.dev>, Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Suren Baghdasaryan <surenb@google.com>,
"Liam R. Howlett" <liam@infradead.org>
Subject: [PATCH RFC 0/8] mm/slab: enable runtime sheaves tuning
Date: Sat, 16 May 2026 01:24:24 +0900
Message-ID: <20260516-sheaves-tuning-v1-0-221aa3e1d829@kernel.org>
Background
==========
Sheaves were introduced in v6.18, and starting from v7.0, they are
enabled for all slab caches (except for kmem_cache{,_node}). In the
pre-sheaves era, there was a cpu_partial parameter to tune the number
of objects cached per CPU. However, sheaves don't have an equivalent
and the sheaf capacity is determined in the kernel code.
The goal is to allow tuning sheaves at runtime by the next LTS.
Overview
========
This patchset does two main things:
1. Make the sheaf_capacity sysfs attribute writable so that the number
   of objects cached per CPU can be changed at runtime, and
2. Expose MAX_FULL_SHEAVES and MAX_EMPTY_SHEAVES as sysfs attributes
   rather than constants, so that users can tune them.
Measuring the performance impact of these tunables is TBD.
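For illustration, here is a minimal userspace sketch of how such a
tunable might be written once it is writable. It assumes the attributes
live under /sys/kernel/slab/<cache>/ like the other SLUB attributes and
accept a decimal value; the helper names slab_attr_path() and
slab_attr_write() are hypothetical and not part of this series.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Build "/sys/kernel/slab/<cache>/<attr>" into buf.
 * Returns 0 on success or -ENAMETOOLONG if buf is too small.
 * (Hypothetical helper, for illustration only.)
 */
static int slab_attr_path(char *buf, size_t len,
			  const char *cache, const char *attr)
{
	int n = snprintf(buf, len, "/sys/kernel/slab/%s/%s", cache, attr);

	if (n < 0 || (size_t)n >= len)
		return -ENAMETOOLONG;
	return 0;
}

/*
 * Write a decimal value to a slab sysfs attribute.
 * Assumption: the attribute parses a plain decimal number.
 * Returns 0 on success or a negative errno value.
 */
static int slab_attr_write(const char *cache, const char *attr,
			   unsigned int val)
{
	char path[256], num[16];
	ssize_t written;
	int fd, len, ret;

	ret = slab_attr_path(path, sizeof(path), cache, attr);
	if (ret)
		return ret;

	len = snprintf(num, sizeof(num), "%u\n", val);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, num, (size_t)len);
	if (written == len)
		ret = 0;
	else if (written < 0)
		ret = -errno;	/* e.g. -EACCES while the attribute is read-only */
	else
		ret = -EIO;	/* short write */

	close(fd);
	return ret;
}
```

A caller would use something like
slab_attr_write("kmalloc-64", "sheaf_capacity", 64), where the cache
name is only an example. On a kernel without this series the write is
expected to fail because the attribute is not writable.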
Roughly, the sequence to change sheaf_capacity is as follows:
1. Disable sheaves. Make all online CPUs replace their main sheaves
   with the bootstrap sheaf under local_lock and wait for completion.
2. Wait for all in-flight RCU callbacks to be processed.
3. Flush and free all existing sheaves.
4. Re-enable sheaves with a new capacity.
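The ordering of these four steps can be sketched with a single-threaded
userspace toy model. Everything here (struct toy_cache, struct toy_pcs,
toy_set_sheaf_capacity()) is made up for illustration; CPUs are array
slots, sheaves are reduced to object counters, and no locking or real
RCU is modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Toy per-CPU state; not the kernel's struct slub_percpu_sheaves. */
struct toy_pcs {
	bool enabled;			/* false models "bootstrap sheaf installed" */
	unsigned short capacity;	/* capacity this CPU's sheaves were built with */
	unsigned int cached;		/* objects held in this CPU's sheaves */
};

/* Toy cache; not the kernel's struct kmem_cache. */
struct toy_cache {
	unsigned short sheaf_capacity;
	unsigned int rcu_pending;	/* objects in sheaves awaiting RCU callbacks */
	unsigned int barn_cached;	/* objects in sheaves parked in the barn */
	unsigned int returned;		/* objects handed back to the slabs */
	struct toy_pcs pcs[TOY_NR_CPUS];
};

static void toy_set_sheaf_capacity(struct toy_cache *s, unsigned short new_cap)
{
	int cpu;

	/* Step 1: disable sheaves on every CPU. */
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		s->pcs[cpu].enabled = false;

	/* Step 2: wait for in-flight RCU callbacks (modelled as a drain). */
	s->returned += s->rcu_pending;
	s->rcu_pending = 0;

	/* Step 3: flush and free every sheaf built with the old capacity. */
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		s->returned += s->pcs[cpu].cached;
		s->pcs[cpu].cached = 0;
	}
	s->returned += s->barn_cached;
	s->barn_cached = 0;

	/* Step 4: publish the new capacity and re-enable sheaves. */
	s->sheaf_capacity = new_cap;
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		s->pcs[cpu].capacity = new_cap;
		s->pcs[cpu].enabled = true;
	}
}
```

The point of the ordering is that nothing built with the old capacity
survives step 3, so step 4 starts from an empty state. The real code
additionally has to cope with allocations and frees racing with every
step, which this model ignores.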
Challenges
==========
1. Allocations and frees can happen concurrently at any point between
   these steps, and we cannot introduce heavyweight synchronization
   mechanisms on the fastpath.
2. Currently, cache_has_sheaves() checks whether a cache has sheaves.
   This works now because sheaves cannot be enabled or disabled once
   the cache is created.
   The question "Does this cache have sheaves?" should be split into
   "Does this cache support sheaves?" and "Does this CPU actually have
   sheaves enabled right now?".
3. Once the sheaf capacity update is complete, no sheaf with a stale
   capacity may remain. Flushing and freeing all existing sheaves is
   relatively simple, but under the current design it is quite
   challenging to prevent sheaves with a stale capacity from being
   installed in the pcs or the barn. Reading s->sheaf_capacity without
   an expensive synchronization primitive is racy.
   Patch 6 introduces a copy of s->sheaf_capacity in struct
   slub_percpu_sheaves to address this. pcs->capacity is copied from
   s->sheaf_capacity and is stable under local_lock. If
   s->sheaf_capacity and pcs->capacity don't match, the sheaf_capacity
   writer is responsible for flushing and freeing the affected sheaves
   before completing the process.
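As a rough illustration of why a per-CPU copy helps, here is a toy
userspace sketch in which a sheaf is only installed when its capacity
matches the per-CPU copy. Comparing the per-sheaf capacity against
pcs->capacity at install time is my assumption about how the two fields
could be combined, not a description of what patch 6 does; struct
toy_sheaf, struct toy_pcs and toy_install_main() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy sheaf header; only the fields needed for the check. */
struct toy_sheaf {
	unsigned short capacity;	/* capacity the sheaf was allocated with */
	unsigned short size;		/* objects currently stored */
};

/* Toy per-CPU state; capacity stands in for pcs->capacity. */
struct toy_pcs {
	unsigned short capacity;	/* copy of s->sheaf_capacity for this CPU */
	struct toy_sheaf *main;		/* currently installed main sheaf */
};

/*
 * Install @sheaf as the main sheaf of @pcs.
 * The caller is assumed to hold the per-CPU local_lock (not modelled),
 * which is what makes pcs->capacity stable during the comparison.
 * Returns false if the sheaf is stale; the caller must then flush and
 * free it instead of installing it.
 */
static bool toy_install_main(struct toy_pcs *pcs, struct toy_sheaf *sheaf)
{
	if (sheaf->capacity != pcs->capacity)
		return false;

	pcs->main = sheaf;
	return true;
}
```

The comparison only reads per-CPU and per-sheaf fields, so in this model
the fastpath never has to read s->sheaf_capacity directly.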
Patch Sequence
==============
Patches 1-3: A per-sheaf capacity is required for the following steps,
but I didn't want to grow struct slab_sheaf. So patch 1 drops the cache
pointer (which was used only on the slowpath), patch 2 changes
sheaf_capacity from unsigned int to unsigned short, and patch 3 adds
per-sheaf capacity.
As a result, struct slab_sheaf actually shrinks after those patches.
After (24 bytes, excluding the objects flex array):
	struct slab_sheaf {
		union {
			struct rcu_head rcu_head;
			struct list_head barn_list;
			bool pfmemalloc;
		};
		unsigned short capacity;
		unsigned short size;
		int node;
		void *objects[];
	};
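The 24-byte figure can be checked in userspace on a 64-bit build. The
sketch below uses stand-in definitions of struct rcu_head and struct
list_head that are two pointers wide, like the kernel types; the helper
slab_sheaf_header_size() is hypothetical and only exists to make the
size observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins: both kernel types are two pointers wide. */
struct list_head {
	struct list_head *next, *prev;
};

struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

struct slab_sheaf {
	union {
		struct rcu_head rcu_head;
		struct list_head barn_list;
		bool pfmemalloc;
	};				/* 16 bytes with 8-byte pointers */
	unsigned short capacity;	/* offset 16 */
	unsigned short size;		/* offset 18 */
	int node;			/* offset 20 */
	void *objects[];		/* starts at offset 24 */
};

/* Only meaningful when pointers are 8 bytes; skipped otherwise. */
static_assert(sizeof(void *) != 8 || sizeof(struct slab_sheaf) == 24,
	      "slab_sheaf header should be 24 bytes with 8-byte pointers");

/* Size of the header, excluding the objects flex array. */
static size_t slab_sheaf_header_size(void)
{
	return sizeof(struct slab_sheaf);
}
```

The two unsigned short fields and the int pack into the 8 bytes after
the union, so adding the per-sheaf capacity introduces no padding.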
Patch 4 allows bootstrap_cache_sheaves() to fail so that it can be
used to re-enable sheaves without panicking the kernel.
Patch 5 splits cache_has_sheaves() into cache_supports_sheaves()
and pcs_has_sheaves().
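The split can be pictured with the following toy userspace sketch. The
two predicates mirror the names above with a toy_ prefix, but their
bodies and the fields they read (supports_sheaves, enabled) are
assumptions made for illustration; the real checks in patch 5 may look
quite different.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy cache: the property below never changes after creation. */
struct toy_cache {
	bool supports_sheaves;
};

/* Toy per-CPU state: may flip while the capacity is being changed. */
struct toy_pcs {
	bool enabled;
};

/* "Does this cache support sheaves?" Immutable, safe to read anywhere. */
static bool toy_cache_supports_sheaves(const struct toy_cache *s)
{
	return s->supports_sheaves;
}

/*
 * "Does this CPU have sheaves enabled right now?"
 * Assumed to be called with the per-CPU local_lock held (not modelled).
 */
static bool toy_pcs_has_sheaves(const struct toy_pcs *pcs)
{
	return pcs->enabled;
}

/* A fastpath would need both answers to be yes. */
static bool toy_can_use_sheaf_fastpath(const struct toy_cache *s,
				       const struct toy_pcs *pcs)
{
	return toy_cache_supports_sheaves(s) && toy_pcs_has_sheaves(pcs);
}
```

While a capacity update is in progress, the first predicate stays true
and only the second one turns false, which sends the affected CPU to
the slowpath.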
Patch 6 enables tuning the sheaf capacity at runtime.
Patch 7 adds lockdep asserts to verify the new rule "Always hold
local_lock when accessing the barn" to make sure there is no sheaf
with stale capacity.
Patch 8 turns MAX_FULL_SHEAVES and MAX_EMPTY_SHEAVES into sysfs
attributes (max_full_sheaves, max_empty_sheaves) and allows tuning.
RFC V1 is also available in git at:
https://git.kernel.org/pub/scm/linux/kernel/git/harry/linux.git/log/?h=sheaves-tuning-rfc-v1r1
Signed-off-by: Harry Yoo (Oracle) <harry@kernel.org>
---
Harry Yoo (Oracle) (8):
      mm/slab: do not store cache pointer in struct slab_sheaf
      mm/slab: change sheaf_capacity type to unsigned short
      mm/slab: track capacity per sheaf
      mm/slab: allow bootstrap_cache_sheaves() to fail
      mm/slab: rework cache_has_sheaves() to check immutable properties only
      mm/slab: allow changing sheaf_capacity at runtime
      mm/slab: add pcs->lock lockdep assert when accessing the barn
      mm/slab: allow changing max_{full,empty}_sheaves at runtime

 include/linux/slab.h         |   8 +-
 mm/slab.h                    |  40 ++-
 mm/slab_common.c             |   2 +-
 mm/slub.c                    | 715 ++++++++++++++++++++++++++++++-------------
 tools/include/linux/slab.h   |  14 +-
 tools/testing/shared/linux.c |   4 +-
 6 files changed, 563 insertions(+), 220 deletions(-)
---
base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
change-id: 20260515-sheaves-tuning-e1f897dc7f5e
Best regards,
--
Cheers,
Harry / Hyeonggon