Linux-mm Archive on lore.kernel.org
From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Al Viro <viro@zeniv.linux.org.uk>, linux-mm@kvack.org
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Harry Yoo <harry.yoo@oracle.com>,
	linux-fsdevel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Mateusz Guzik <mjguzik@gmail.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v3 00/10] kmem_cache instances with static storage duration
Date: Tue, 23 Jun 2026 10:09:41 +0200
Message-ID: <151b1c96-0ed5-44ce-b9d5-86a48d8ad592@kernel.org>
In-Reply-To: <20260613050951.855141-1-viro@zeniv.linux.org.uk>

On 6/13/26 07:09, Al Viro wrote:
> Changes since v2:
> 	* fixed the braindamage in /sys/kernel/slab (any statically allocated
> caches should just use their name for subdirectory, whether they are mergeable
> or not)
> 	* infrastructure bits slightly reordered and carved in hopefully saner
> way.
> 	* rebased to 7.1-rc7.
> 
> Changes since v1:
> 	* milder restrictions on mergeability (non-modular static cache
> can be used as alias for later dynamic ones)
> 	* consolidated conversions into fewer commits
> 	* rebased to 7.1-rc6.
> 
> Branch lives in
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.kmem_cache_static
> individual patches in followups.
> 
> Please, review.  It appears to work; the impact on merging is very small -
> on the testbox here (251 distinct aliases) mainline gets 140 kmem_cache
> instances and this series gets 142 - fs_cachep and names_cache get split
> from the group of 192-byte caches.  Considering the contents and use patterns
> for those two... we might be better off that way, actually.
> 
>         kmem_cache_create() and friends create new instances of
> struct kmem_cache and return pointers to those.  Quite a few things in
> core kernel are allocated from such caches; each allocation involves
> dereferencing an assign-once pointer and for sufficiently hot ones that
> dereferencing does show in profiles.
> 
>         There had been patches floating around switching some of those
> to runtime_const infrastructure.  Unfortunately, it's arch-specific
> and most of the architectures lack it.

Only a handful of architectures are performance-critical enough for us to
care, right? So the question is which of those lack it, and whether adding
runtime_const support there would take less effort than doing this. Also,
this series covers slab only, whereas I'd expect runtime_const() to have
use cases in other subsystems as well, without needing subsystem-specific
changes?

>         There's an alternative approach applicable at least to the caches
> that are never destroyed, which covers a lot of them.  No matter what,
> runtime_const for pointers is not going to be faster than plain &,
> so if we had struct kmem_cache instances with static storage duration, we
> would be at least no worse off than we are with runtime_const variants.

But shouldn't the argument for static duration support be that it's
faster, not just "not slower"? So is runtime_const equivalent, or is it
slower than plain & for some fundamental reason?

Or is the advantage that static caches can support modules and runtime_const
can't?
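
To make the comparison concrete, here is a minimal userspace sketch of the
two access patterns being discussed. It does not model runtime_const at all;
struct cache, cache_ptr and the function names are made up for illustration
and are not taken from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in; the real struct kmem_cache is private to mm/. */
struct cache {
	size_t object_size;
};

/* Variant 1: an assign-once pointer, filled in during init. */
static struct cache dynamic_instance = { .object_size = 192 };
static struct cache *cache_ptr;

static void init_caches(void)
{
	cache_ptr = &dynamic_instance;
}

static size_t size_via_pointer(void)
{
	/* The pointer itself has to be loaded before the field can be read. */
	return cache_ptr->object_size;
}

/* Variant 2: an instance with static storage duration. */
static struct cache static_instance = { .object_size = 192 };

static size_t size_via_address(void)
{
	/* The address is known at link time, so no pointer load is needed. */
	return (&static_instance)->object_size;
}
```

Both variants return the same value; the difference is only in whether the
cache address comes from a memory load or from the instruction stream.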

>         There are obstacles to doing that, but they turn out to be easy
> to deal with.
> 
> 1) as it is, struct kmem_cache is opaque for anything outside of a few
> files in mm/*; that avoids serious headache with header dependencies,
> etc., and it's not something we want to lose.  Solution: struct
> kmem_cache_opaque, with the size and alignment identical to struct
> kmem_cache.  Calculation of size and alignment can be done via the same
> mechanism we use for asm-offsets.h and rq-offsets.h, with build-time
> check for mismatches.  With that done, we get an opaque type defined in
> linux/slab-static.h that can be used for declaring those caches.
> In linux/slab.h we add a forward declaration of kmem_cache_opaque +
> helper (to_kmem_cache()) converting a pointer to kmem_cache_opaque
> into pointer to kmem_cache.
> 
> 2) real constructor of kmem_cache needs to be taught to deal with
> preallocated instances.  That turns out to be easy - we already pass an
> obscene amount of optional arguments via struct kmem_cache_args, so we
> can stash the pointer to preallocated instance in there.  Changes in
> mm/slab_common.c are very minor - we should not merge preallocated caches
> with anything already created, use the instance passed to us instead of
> allocating a new one and we should not free them.  That's it.
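
A minimal userspace sketch of the opaque-type idea in 1), under stated
assumptions: the fields of struct kmem_cache below are placeholders, and
KMEM_CACHE_SIZE / KMEM_CACHE_ALIGN are hypothetical names for the constants
the build step would generate (here they are simply taken from sizeof and
alignof, which a real header could not do).

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* "Private" definition, standing in for what only mm/ would see. */
struct kmem_cache {
	unsigned int object_size;
	unsigned long flags;
	void *cpu_slab;
};

/* Hypothetical generated constants; a shortcut for this sketch only. */
#define KMEM_CACHE_SIZE  sizeof(struct kmem_cache)
#define KMEM_CACHE_ALIGN alignof(struct kmem_cache)

/* "Public" type: same size and alignment, no visible fields. */
struct kmem_cache_opaque {
	alignas(KMEM_CACHE_ALIGN) unsigned char storage[KMEM_CACHE_SIZE];
};

/* Build-time check for mismatches between the two definitions. */
static_assert(sizeof(struct kmem_cache_opaque) == sizeof(struct kmem_cache),
	      "size mismatch");
static_assert(alignof(struct kmem_cache_opaque) == alignof(struct kmem_cache),
	      "alignment mismatch");

/*
 * Pointer conversion helper.  The kernel builds with -fno-strict-aliasing;
 * strictly conforming userspace code would use a union or memcpy instead.
 */
static inline struct kmem_cache *to_kmem_cache(struct kmem_cache_opaque *p)
{
	return (struct kmem_cache *)p;
}

/* A statically allocated instance, as a user of the header would declare it. */
static struct kmem_cache_opaque demo_cache;

static unsigned int demo_roundtrip(void)
{
	to_kmem_cache(&demo_cache)->object_size = 192;
	return to_kmem_cache(&demo_cache)->object_size;
}
```

The point of the two assertions is that users only ever see the byte array,
while a layout change in the private struct breaks the build instead of
silently corrupting memory.
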
> 
> 	A set of helpers parallel to kmem_cache_create() and friends
> (kmem_cache_setup(), etc.) is provided in the same linux/slab-static.h;
> generally, conversion affects only a few lines.
> 
> 	Note that slab-static.h is needed only in places that create
> such instances; all users need only slab.h (and they can be modular,
> unlike runtime_const-based approach).
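
The handling of preallocated instances described in 2) might look roughly
like the following userspace model. Everything here is an assumption made
for illustration: struct cache, the preallocated field in struct cache_args,
and cache_create()/cache_setup()/cache_destroy() are hypothetical stand-ins,
not the signatures from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct cache {
	const char *name;
	size_t object_size;
	int preallocated;	/* models the SLAB_PREALLOCATED bit */
};

/* Optional arguments; the preallocated pointer is stashed in here. */
struct cache_args {
	size_t align;
	struct cache *preallocated;	/* hypothetical field name */
};

static struct cache *cache_create(const char *name, size_t size,
				  const struct cache_args *args)
{
	struct cache *s = args ? args->preallocated : NULL;
	int prealloc = (s != NULL);

	/* Use the instance passed in instead of allocating a new one. */
	if (!s) {
		s = malloc(sizeof(*s));
		if (!s)
			return NULL;
	}
	s->name = name;
	s->object_size = size;
	s->preallocated = prealloc;
	return s;
}

static void cache_destroy(struct cache *s)
{
	/* Preallocated instances are never freed. */
	if (s && !s->preallocated)
		free(s);
}

/* Helper parallel to cache_create() for statically allocated storage. */
static int cache_setup(struct cache *storage, const char *name, size_t size)
{
	struct cache_args args = { .preallocated = storage };

	return cache_create(name, size, &args) == storage ? 0 : -1;
}
```

A caller with static storage would then use cache_setup(&my_cache, ...) once
at init and pass &my_cache wherever a cache pointer is expected.
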
> 
> 
> 	That covers the instances that never get destroyed.  Quite a few
> fall into that category, but there's a major exception - anything in
> modules must be destroyed before the module gets removed.  Note that
> unlike runtime_const-based approach, cache _uses_ in a module are
> fine - if kmem_cache_opaque instance is exported, its address is available
> to modules without any problems.  It's caches _created_ in a module

So are there uses in modules of caches created in built-in code, which
therefore could not be changed to runtime_const?

> that offer an extra twist.
> 
> 	Teaching kmem_cache_destroy() to skip actual freeing of given
> kmem_cache instance is trivial; the problem is that kmem_cache_destroy()
> may overlap with sysfs access to attributes of that cache.  In that
> case kmem_cache_destroy() may return before the instance gets freed -
> freeing (from slab_kmem_cache_release()) happens when the refcount of
> embedded kobject drops to zero.  That's fine, since all references
> to data structures in module's memory are already gone by the time
> kmem_cache_destroy() returns.  That, however, relies upon the struct
> kmem_cache itself not being in module's memory; getting it unmapped
> before slab_kmem_cache_release() has run needs to be avoided.
> 
> 	It's not hard to deal with, though.  We need to make sure that
> instance in a module will get to slab_kmem_cache_release() before the
> module data gets freed.  That's only a problem on sysfs setups -
> otherwise it'll definitely be finished before kmem_cache_destroy()
> returns.
> 
> 	Note that modules themselves have sysfs-exposed attributes,
> so a similar problem already exists there.  That's dealt with by
> having mod_sysfs_teardown() wait for refcount of module->mkobj.kobj
> reaching zero.  Let's make use of that - have static-duration-in-module
> kmem_cache instances grab a reference to that kobject upon setup and
> drop it in the end of slab_kmem_cache_release().
> 
> 	Let setup helpers store the kobject to be pinned in
> kmem_cache_args->owner (for preallocated; if somebody manually sets it
> for non-preallocated case, it'll be ignored).  That would be
> &THIS_MODULE->mkobj.kobj for a module and NULL in built-in.
> 
> 	If sysfs is enabled and we are dealing with preallocated instance,
> let create_cache() grab and stash that reference in kmem_cache->owner
> and let slab_kmem_cache_release() drop it instead of freeing kmem_cache
> instance.
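
The lifetime rule above can be modelled in userspace with two plain
reference counts. This is only a sketch under simplifying assumptions:
struct kobj, cache_kobj_put() and module_memory_pinned() are invented
for the example, the counts are not atomic, and "refcount 1" on the module
kobject stands for the module's own reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct kobj {
	int refcount;
};

struct cache {
	struct kobj kobj;	/* embedded; sysfs readers pin this */
	struct kobj *owner;	/* module kobject, or NULL for built-in */
	int preallocated;
	int released;		/* set once release has run */
};

static void cache_setup(struct cache *s, struct kobj *owner)
{
	s->kobj.refcount = 1;
	s->preallocated = 1;
	s->released = 0;
	s->owner = owner;
	if (owner)
		owner->refcount++;	/* pin the module until release */
}

/* Runs when the embedded kobject's refcount drops to zero. */
static void cache_release(struct cache *s)
{
	struct kobj *owner = s->owner;

	s->released = 1;
	if (!s->preallocated) {
		free(s);
		return;
	}
	/* Preallocated: drop the pin instead of freeing the instance. */
	if (owner)
		owner->refcount--;
}

static void cache_kobj_put(struct cache *s)
{
	if (--s->kobj.refcount == 0)
		cache_release(s);
}

/* Module teardown has to wait while any cache still holds a pin. */
static int module_memory_pinned(const struct kobj *mod)
{
	return mod->refcount > 1;
}
```

With a sysfs reader holding the cache kobject, the destroy-side put returns
without releasing, and the module stays pinned until the reader drops its
reference.
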
> 
> 
> 	Costs:
> * a bit (SLAB_PREALLOCATED) is stolen from slab_flags_t
> * such caches can't be merged with anything preexisting (obviously)
> and subsequent cache creations can't merge with static-in-module
> ones.  If you want them more mergeable, don't use this technique.
> * you can't do kmem_cache_setup()/kmem_cache_destroy()/kmem_cache_setup()
> on the same instance.  Just don't do that.
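
The merging restrictions listed under "Costs", together with the v1
changelog entry about non-modular static caches, can be summarised as a
small predicate. This is a sketch of the rules as stated in the cover
letter, not the code from mm/slab_common.c; struct cache_desc and
may_alias() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

struct cache_desc {
	bool preallocated;	/* models SLAB_PREALLOCATED */
	bool in_module;		/* instance lives in module memory */
};

/*
 * May a cache being created become an alias of an existing one?
 * The usual size/flags compatibility checks are left out.
 */
static bool may_alias(const struct cache_desc *existing,
		      bool new_is_preallocated)
{
	/* A preallocated cache always uses its own instance. */
	if (new_is_preallocated)
		return false;

	/* Nothing may merge with a static instance inside a module. */
	if (existing->preallocated && existing->in_module)
		return false;

	/* Dynamic caches and built-in static ones can serve as aliases. */
	return true;
}
```
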
> 
> Al Viro (9):
>   static kmem_cache instances for core caches
>   allow static-duration kmem_cache in modules
>   VFS caches: switch from runtime_const() machinery to slab-static.h
>   make inode_cache statically allocated
>   make mnt_cache statically allocated
>   make bh_cachep statically allocated
>   make seq_file_cache statically allocated
>   make thread component caches (fs_cachep, files_cachep, etc.)
>     statically allocated
>   make ufs_inode_cache statically allocated
> 
>  Kbuild                            | 14 ++++++-
>  fs/buffer.c                       |  6 ++-
>  fs/dcache.c                       |  8 ++--
>  fs/file_table.c                   | 40 ++++++++----------
>  fs/inode.c                        |  6 ++-
>  fs/namei.c                        | 16 +++----
>  fs/namespace.c                    |  6 ++-
>  fs/seq_file.c                     |  6 ++-
>  fs/ufs/super.c                    |  9 ++--
>  include/asm-generic/vmlinux.lds.h |  6 +--
>  include/linux/fdtable.h           |  3 +-
>  include/linux/fs_struct.h         |  3 +-
>  include/linux/signal.h            |  3 +-
>  include/linux/slab-static.h       | 69 +++++++++++++++++++++++++++++++
>  include/linux/slab.h              | 11 +++++
>  kernel/fork.c                     | 37 ++++++++++-------
>  mm/kmem_cache_size.c              | 20 +++++++++
>  mm/slab.h                         |  1 +
>  mm/slab_common.c                  | 49 ++++++++++++++--------
>  mm/slub.c                         |  7 ++++
>  20 files changed, 231 insertions(+), 89 deletions(-)
>  create mode 100644 include/linux/slab-static.h
>  create mode 100644 mm/kmem_cache_size.c
> 
> --
> 2.47.3
> 
> 
> 
> Al Viro (10):
>   static kmem_cache instances for core caches: infrastructure
>   static kmem_cache instances for core caches: setup primitives
>   allow preallocated kmem_cache instances in modules
>   VFS caches: switch from runtime_const() machinery to slab-static.h
>   make inode_cache statically allocated
>   make mnt_cache statically allocated
>   make bh_cachep statically allocated
>   make seq_file_cache statically allocated
>   make thread component caches (fs_cachep, files_cachep, etc.)
>     statically allocated
>   make ufs_inode_cache statically allocated
> 
>  Kbuild                            | 14 ++++++-
>  fs/buffer.c                       |  6 ++-
>  fs/dcache.c                       |  8 ++--
>  fs/file_table.c                   | 40 ++++++++----------
>  fs/inode.c                        |  6 ++-
>  fs/namei.c                        | 16 +++----
>  fs/namespace.c                    |  6 ++-
>  fs/seq_file.c                     |  6 ++-
>  fs/ufs/super.c                    |  9 ++--
>  include/asm-generic/vmlinux.lds.h |  6 +--
>  include/linux/fdtable.h           |  3 +-
>  include/linux/fs_struct.h         |  3 +-
>  include/linux/signal.h            |  3 +-
>  include/linux/slab-static.h       | 69 +++++++++++++++++++++++++++++++
>  include/linux/slab.h              | 11 +++++
>  kernel/fork.c                     | 37 ++++++++++-------
>  mm/kmem_cache_size.c              | 20 +++++++++
>  mm/slab.h                         |  1 +
>  mm/slab_common.c                  | 50 +++++++++++++++-------
>  mm/slub.c                         | 27 +++++++-----
>  20 files changed, 242 insertions(+), 99 deletions(-)
>  create mode 100644 include/linux/slab-static.h
>  create mode 100644 mm/kmem_cache_size.c
> 



