From: Al Viro <viro@zeniv.linux.org.uk>
To: linux-mm@kvack.org
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Harry Yoo <harry.yoo@oracle.com>,
	linux-fsdevel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Mateusz Guzik <mjguzik@gmail.com>,
	linux-kernel@vger.kernel.org
Subject: [RFC PATCH v3 00/10] kmem_cache instances with static storage duration
Date: Sat, 13 Jun 2026 06:09:41 +0100
Message-ID: <20260613050951.855141-1-viro@zeniv.linux.org.uk>
In-Reply-To: <20260611171425.1671254-1-viro@zeniv.linux.org.uk>

Changes since v2:
	* fixed the braindamage in /sys/kernel/slab (any statically allocated
caches should just use their name for the subdirectory, whether they are
mergeable or not)
	* infrastructure bits slightly reordered and carved up in a hopefully
saner way.
	* rebased to 7.1-rc7.

Changes since v1:
	* milder restrictions on mergeability (a non-modular static cache
can be used as an alias for later dynamic ones)
	* consolidated conversions into fewer commits
	* rebased to 7.1-rc6.

Branch lives in
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.kmem_cache_static
individual patches in followups.

Please, review.  It appears to work; the impact on merging is very small -
on the testbox here (251 distinct aliases) mainline gets 140 kmem_cache
instances and this series gets 142 - fs_cachep and names_cache get split
from the group of 192-byte caches.  Considering the contents and use patterns
for those two... we might be better off that way, actually.

        kmem_cache_create() and friends create new instances of
struct kmem_cache and return pointers to those.  Quite a few things in
core kernel are allocated from such caches; each allocation involves
dereferencing an assign-once pointer and for sufficiently hot ones that
dereferencing does show in profiles.

        There had been patches floating around switching some of those
to runtime_const infrastructure.  Unfortunately, it's arch-specific
and most of the architectures lack it.

        There's an alternative approach applicable at least to the caches
that are never destroyed, which covers a lot of them.  No matter what,
runtime_const for pointers is not going to be faster than plain &,
so if we had struct kmem_cache instances with static storage duration, we
would be at least no worse off than we are with runtime_const variants.
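To make the comparison concrete, here is a minimal userspace sketch of the two access patterns.  It is a model under assumptions, not kernel code: the one-field struct kmem_cache and every other name in it (boot_allocated, foo_cachep, foo_cache, the size_via_*() helpers) are made up for illustration.

```c
/* Userspace model only; the struct body and all names here are made up. */
#include <assert.h>

struct kmem_cache {
	unsigned int object_size;
};

/* Today: the instance is allocated at boot and reached via a pointer. */
static struct kmem_cache boot_allocated = { .object_size = 192 };
static struct kmem_cache *foo_cachep;	/* assigned once during init */

/* Proposed: the instance itself has static storage duration. */
static struct kmem_cache foo_cache = { .object_size = 192 };

static unsigned int size_via_pointer(void)
{
	/* load foo_cachep from memory, then load the field through it */
	return foo_cachep->object_size;
}

static unsigned int size_via_address(void)
{
	/* &foo_cache is known to the linker; only the field is loaded */
	return (&foo_cache)->object_size;
}

/* Stand-in for the init code that fills in the assign-once pointer. */
static void model_init(void)
{
	foo_cachep = &boot_allocated;
}
```

Both helpers return the same value once model_init() has run; the difference is only the extra load of the pointer variable in the first one.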

        There are obstacles to doing that, but they turn out to be easy
to deal with.

1) as it is, struct kmem_cache is opaque for anything outside of a few
files in mm/*; that avoids serious headache with header dependencies,
etc., and it's not something we want to lose.  Solution: struct
kmem_cache_opaque, with the size and alignment identical to struct
kmem_cache.  Calculation of size and alignment can be done via the same
mechanism we use for asm-offsets.h and rq-offsets.h, with build-time
check for mismatches.  With that done, we get an opaque type defined in
linux/slab-static.h that can be used for declaring those caches.
In linux/slab.h we add a forward declaration of kmem_cache_opaque +
helper (to_kmem_cache()) converting a pointer to kmem_cache_opaque
into pointer to kmem_cache.
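A rough userspace sketch of the idea in (1) follows.  Only the names kmem_cache, kmem_cache_opaque and to_kmem_cache() come from the description above; the struct body, the two macros, the opaque_bytes member and demo_cache are assumptions made for the model.  In the series the size and alignment would presumably be plain numbers in a generated header, so that the public header never sees the real definition; here they are spelled with sizeof to keep the example in one file.

```c
/* Userspace model; not the contents of linux/slab-static.h. */
#include <assert.h>
#include <stddef.h>

/* "Private" side: stand-in for the real definition living in mm/. */
struct kmem_cache {
	unsigned int object_size;
	unsigned int align;
	const char *name;
	void *cpu_slab;
};

/*
 * "Generated" constants.  In a real build these would be literal numbers
 * emitted asm-offsets style; the macro names are hypothetical.
 */
#define KMEM_CACHE_SIZE		sizeof(struct kmem_cache)
#define KMEM_CACHE_ALIGN	_Alignof(struct kmem_cache)

/* "Public" side: same size and alignment, no visible members. */
struct kmem_cache_opaque {
	_Alignas(KMEM_CACHE_ALIGN) unsigned char opaque_bytes[KMEM_CACHE_SIZE];
};

/* Build-time check for mismatches between the two views. */
static_assert(sizeof(struct kmem_cache_opaque) == sizeof(struct kmem_cache),
	      "kmem_cache_opaque size mismatch");
static_assert(_Alignof(struct kmem_cache_opaque) == _Alignof(struct kmem_cache),
	      "kmem_cache_opaque alignment mismatch");

/*
 * Pointer conversion only.  The kernel is built with -fno-strict-aliasing,
 * so dereferencing the result is fine there; this model never does.
 */
static inline struct kmem_cache *to_kmem_cache(struct kmem_cache_opaque *p)
{
	return (struct kmem_cache *)p;
}

/* What a user of the header would write to declare a static cache. */
static struct kmem_cache_opaque demo_cache;
```

If the real struct changes size, the static_assert lines fail the build instead of letting the two views drift apart.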

2) real constructor of kmem_cache needs to be taught to deal with
preallocated instances.  That turns out to be easy - we already pass an
obscene amount of optional arguments via struct kmem_cache_args, so we
can stash the pointer to preallocated instance in there.  Changes in
mm/slab_common.c are very minor - we should not merge preallocated caches
with anything already created, use the instance passed to us instead of
allocating a new one and we should not free them.  That's it.
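The control flow described in (2) can be modelled in userspace roughly as below.  This is a sketch under assumptions rather than the code in mm/slab_common.c: the field name "preallocated" in kmem_cache_args, the model_*() functions and the cut-down structs are all hypothetical, and merging is reduced to a comment.

```c
/* Userspace model of "use the instance passed to us, don't free it". */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct kmem_cache {
	const char *name;
	unsigned int object_size;
	int preallocated;	/* models the SLAB_PREALLOCATED flag bit */
};

struct kmem_cache_args {
	unsigned int align;
	struct kmem_cache *preallocated;	/* hypothetical field name */
};

static struct kmem_cache *model_cache_create(const char *name,
					     unsigned int size,
					     const struct kmem_cache_args *args)
{
	struct kmem_cache *s;

	if (args && args->preallocated) {
		/* caller's storage: no merging with existing caches */
		s = args->preallocated;
		s->preallocated = 1;
	} else {
		/* (the real code would look for a mergeable cache first) */
		s = calloc(1, sizeof(*s));
		if (!s)
			return NULL;
	}
	s->name = name;
	s->object_size = size;
	return s;
}

static void model_cache_release(struct kmem_cache *s)
{
	/* preallocated instances are never handed back to the allocator */
	if (!s->preallocated)
		free(s);
}
```

A caller would point args.preallocated at an object with static storage duration and get that same address back; passing NULL args takes the usual allocating path.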

	A set of helpers parallel to kmem_cache_create() and friends
(kmem_cache_setup(), etc.) is provided in the same linux/slab-static.h;
generally, conversion affects only a few lines.

	Note that slab-static.h is needed only in places that create
such instances; all users need only slab.h (and they can be modular,
unlike with the runtime_const-based approach).


	That covers the instances that never get destroyed.  Quite a few
fall into that category, but there's a major exception - anything in
modules must be destroyed before the module gets removed.  Note that
unlike the runtime_const-based approach, cache _uses_ in a module are
fine - if kmem_cache_opaque instance is exported, its address is available
to modules without any problems.  It's caches _created_ in a module
that offer an extra twist.

	Teaching kmem_cache_destroy() to skip actual freeing of given
kmem_cache instance is trivial; the problem is that kmem_cache_destroy()
may overlap with sysfs access to attributes of that cache.  In that
case kmem_cache_destroy() may return before the instance gets freed -
freeing (from slab_kmem_cache_release()) happens when the refcount of
embedded kobject drops to zero.  That's fine, since all references
to data structures in module's memory are already gone by the time
kmem_cache_destroy() returns.  That, however, relies upon the struct
kmem_cache itself not being in module's memory; getting it unmapped
before slab_kmem_cache_release() has run needs to be avoided.

	It's not hard to deal with, though.  We need to make sure that
instance in a module will get to slab_kmem_cache_release() before the
module data gets freed.  That's only a problem on sysfs setups -
otherwise it'll definitely be finished before kmem_cache_destroy()
returns.

	Note that modules themselves have sysfs-exposed attributes,
so a similar problem already exists there.  That's dealt with by
having mod_sysfs_teardown() wait for refcount of module->mkobj.kobj
reaching zero.  Let's make use of that - have static-duration-in-module
kmem_cache instances grab a reference to that kobject upon setup and
drop it at the end of slab_kmem_cache_release().

	Let setup helpers store the kobject to be pinned in
kmem_cache_args->owner (for preallocated; if somebody manually sets it
for non-preallocated case, it'll be ignored).  That would be
&THIS_MODULE->mkobj.kobj for a module and NULL for built-in code.

	If sysfs is enabled and we are dealing with preallocated instance,
let create_cache() grab and stash that reference in kmem_cache->owner
and let slab_kmem_cache_release() drop it instead of freeing kmem_cache
instance.
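The ordering argument above can be checked with a small userspace model of the reference counting.  Everything in it is an assumption made for illustration: struct kobject is reduced to a bare counter, and the model_*() helpers stand in for kobject get/put, cache setup, kmem_cache_destroy(), slab_kmem_cache_release() and the wait in mod_sysfs_teardown().

```c
/* Userspace model of pinning the module kobject from a static cache. */
#include <assert.h>
#include <stddef.h>

struct kobject {
	int refcount;		/* stand-in for the real kref */
};

static void model_kobject_get(struct kobject *k)
{
	if (k)
		k->refcount++;
}

static void model_kobject_put(struct kobject *k)
{
	if (k)
		k->refcount--;
}

struct kmem_cache {
	struct kobject kobj;	/* the cache's own sysfs object */
	struct kobject *owner;	/* pinned module kobject, NULL if built-in */
	int released;
};

static void model_cache_setup(struct kmem_cache *s, struct kobject *owner)
{
	s->kobj.refcount = 1;
	s->released = 0;
	s->owner = owner;
	model_kobject_get(owner);	/* pin the module's kobject */
}

/* Runs when the last reference to s->kobj goes away. */
static void model_cache_release(struct kmem_cache *s)
{
	struct kobject *owner = s->owner;

	s->released = 1;
	/* last access to *s is above; only now unpin the module */
	model_kobject_put(owner);
}

static void model_cache_put(struct kmem_cache *s)
{
	if (--s->kobj.refcount == 0)
		model_cache_release(s);
}

/* Destroy drops the initial reference; release may happen later. */
static void model_cache_destroy(struct kmem_cache *s)
{
	model_cache_put(s);
}

/* Teardown may let module memory go only once this returns nonzero. */
static int model_module_can_unmap(const struct kobject *mkobj)
{
	return mkobj->refcount == 0;
}
```

With a sysfs reader still holding the cache's kobject, destroy returns early, the module's kobject stays pinned, and the module memory cannot go away until the reader drops its reference and release has run.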


	Costs:
* a bit (SLAB_PREALLOCATED) is stolen from slab_flags_t
* such caches can't be merged with anything preexisting (obviously)
and subsequent cache creations can't merge with static-in-module
ones.  If you want them more mergeable, don't use this technique.
* you can't do kmem_cache_setup()/kmem_cache_destroy()/kmem_cache_setup()
on the same instance.  Just don't do that.

Al Viro (10):
  static kmem_cache instances for core caches: infrastructure
  static kmem_cache instances for core caches: setup primitives
  allow preallocated kmem_cache instances in modules
  VFS caches: switch from runtime_const() machinery to slab-static.h
  make inode_cache statically allocated
  make mnt_cache statically allocated
  make bh_cachep statically allocated
  make seq_file_cache statically allocated
  make thread component caches (fs_cachep, files_cachep, etc.)
    statically allocated
  make ufs_inode_cache statically allocated

 Kbuild                            | 14 ++++++-
 fs/buffer.c                       |  6 ++-
 fs/dcache.c                       |  8 ++--
 fs/file_table.c                   | 40 ++++++++----------
 fs/inode.c                        |  6 ++-
 fs/namei.c                        | 16 +++----
 fs/namespace.c                    |  6 ++-
 fs/seq_file.c                     |  6 ++-
 fs/ufs/super.c                    |  9 ++--
 include/asm-generic/vmlinux.lds.h |  6 +--
 include/linux/fdtable.h           |  3 +-
 include/linux/fs_struct.h         |  3 +-
 include/linux/signal.h            |  3 +-
 include/linux/slab-static.h       | 69 +++++++++++++++++++++++++++++++
 include/linux/slab.h              | 11 +++++
 kernel/fork.c                     | 37 ++++++++++-------
 mm/kmem_cache_size.c              | 20 +++++++++
 mm/slab.h                         |  1 +
 mm/slab_common.c                  | 50 +++++++++++++++-------
 mm/slub.c                         | 27 +++++++-----
 20 files changed, 242 insertions(+), 99 deletions(-)
 create mode 100644 include/linux/slab-static.h
 create mode 100644 mm/kmem_cache_size.c

-- 
2.47.3