linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Harry Yoo <harry.yoo@oracle.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, vbabka@suse.cz, andreyknvl@gmail.com,
	cl@linux.com, dvyukov@google.com, glider@google.com,
	hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org,
	muchun.song@linux.dev, rientjes@google.com,
	roman.gushchin@linux.dev, ryabinin.a.a@gmail.com,
	shakeel.butt@linux.dev, vincenzo.frascino@arm.com,
	yeoreum.yun@arm.com, tytso@mit.edu, adilger.kernel@dilger.ca,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH V3 6/7] mm/slab: save memory by allocating slabobj_ext array from leftover
Date: Wed, 29 Oct 2025 16:59:59 +0900	[thread overview]
Message-ID: <aQHJfyoUN-tbnVFr@hyeyoo> (raw)
In-Reply-To: <CAJuCfpGY0h2d6VEAEa4kjH2yUMGDdke_QTFt6d+gb+kH=rnXyQ@mail.gmail.com>

On Tue, Oct 28, 2025 at 08:07:42PM -0700, Suren Baghdasaryan wrote:
> On Mon, Oct 27, 2025 at 5:29 AM Harry Yoo <harry.yoo@oracle.com> wrote:
> >
> > The leftover space in a slab is always smaller than s->size, and
> > kmem caches for large objects that are not power-of-two sizes tend to have
> > a greater amount of leftover space per slab. In some cases, the leftover
> > space is larger than the size of the slabobj_ext array for the slab.
> >
> > An excellent example of such a cache is ext4_inode_cache. On my system,
> > the object size is 1144, with a preferred order of 3, 28 objects per slab,
> > and 736 bytes of leftover space per slab.
> >
> > Since the size of the slabobj_ext array is only 224 bytes (w/o mem
> > profiling) or 448 bytes (w/ mem profiling) per slab, the entire array
> > fits within the leftover space.
> >
> > Allocate the slabobj_exts array from this unused space instead of using
> > kcalloc(), when it is large enough. The array is always allocated when
> > creating new slabs, because implementing lazy allocation correctly is
> > difficult without expensive synchronization.
> >
> > To avoid unnecessary overhead when MEMCG (with SLAB_ACCOUNT) and
> > MEM_ALLOC_PROFILING are not used for the cache, only allocate the
> > slabobj_ext array only when either of them are enabled when slabs are
> > created.
> >
> > [ MEMCG=y, MEM_ALLOC_PROFILING=n ]
> >
> > Before patch (creating 2M directories on ext4):
> >   Slab:            3575348 kB
> >   SReclaimable:    3137804 kB
> >   SUnreclaim:       437544 kB
> >
> > After patch (creating 2M directories on ext4):
> >   Slab:            3558236 kB
> >   SReclaimable:    3139268 kB
> >   SUnreclaim:       418968 kB (-18.14 MiB)
> >
> > Enjoy the memory savings!
> >
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > ---
> >  mm/slub.c | 147 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 142 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 13acc9437ef5..8101df5fdccf 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > +static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> > +{
> > +       unsigned long obj_exts;
> > +
> > +       if (!obj_exts_fit_within_slab_leftover(s, slab))
> > +               return false;
> > +
> > +       obj_exts = (unsigned long)slab_address(slab);
> > +       obj_exts += obj_exts_offset_in_slab(s, slab);
> > +       return obj_exts == slab_obj_exts(slab);
> 
> You can check that slab_obj_exts(slab) is not NULL before making the
> above calculations.

Did you mean this?

  if (!slab_obj_exts(slab))
      return false;

If so, yes that makes sense.

> > @@ -2185,6 +2311,11 @@ static inline void free_slab_obj_exts(struct slab *slab)
> >  {
> >  }
> >
> > +static inline void alloc_slab_obj_exts_early(struct kmem_cache *s,
> > +                                                      struct slab *slab)
> > +{
> > +}
> > +
> >  #endif /* CONFIG_SLAB_OBJ_EXT */
> >
> >  #ifdef CONFIG_MEM_ALLOC_PROFILING
> > @@ -3155,7 +3286,9 @@ static inline bool shuffle_freelist(struct kmem_cache *s, struct slab *slab)
> >  static __always_inline void account_slab(struct slab *slab, int order,
> >                                          struct kmem_cache *s, gfp_t gfp)
> >  {
> > -       if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
> > +       if (memcg_kmem_online() &&
> > +                       (s->flags & SLAB_ACCOUNT) &&
> > +                       !slab_obj_exts(slab))
> >                 alloc_slab_obj_exts(slab, s, gfp, true);
> 
> Don't you need to add a check for !obj_exts_in_slab() inside
> alloc_slab_obj_exts() to avoid allocating slab->obj_exts?

slab_obj_exts() should have returned a nonzero value
and then we don't call alloc_slab_obj_exts()?

> >         mod_node_page_state(slab_pgdat(slab), cache_vmstat_idx(s),
> > @@ -3219,9 +3352,6 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >         slab->objects = oo_objects(oo);slab_obj_exts
> >         slab->inuse = 0;
> >         slab->frozen = 0;
> > -       init_slab_obj_exts(slab);
> > -
> > -       account_slab(slab, oo_order(oo), s, flags);
> >
> >         slab->slab_cache = s;
> >
> > @@ -3230,6 +3360,13 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >         start = slab_address(slab);
> >
> >         setup_slab_debug(s, slab, start);
> > +       init_slab_obj_exts(slab);
> > +       /*
> > +        * Poison the slab before initializing the slabobj_ext array
> > +        * to prevent the array from being overwritten.
> > +        */
> > +       alloc_slab_obj_exts_early(s, slab);
> > +       account_slab(slab, oo_order(oo), s, flags);
> 
>  alloc_slab_obj_exts() is called in 2 other places:
> 1. __memcg_slab_post_alloc_hook()
> 2. prepare_slab_obj_exts_hook()
> 
> Don't you need alloc_slab_obj_exts_early() there as well?

That's good point, and I thought it's difficult to address
concurrency problem without using a per-slab lock.

Thread A                    Thread B
- sees slab->obj_exts == 0 
                            - sees slab->obj_exts == 0
			    - allocates the vector from unused space
			      and initializes it.
			    - try cmpxchg()
- allocates the vector
  from unused space and
  initializes it.
  (the vector is already
   in use and it's overwritten!)

- try cmpxchg()

But since this is slowpath, using slab_{lock,unlock}() here is probably
fine. What do you think?

-- 
Cheers,
Harry / Hyeonggon

  reply	other threads:[~2025-10-29  8:00 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-27 12:28 [RFC PATCH V3 0/7] mm/slab: reduce slab accounting memory overhead by allocating slabobj_ext metadata within unused slab space Harry Yoo
2025-10-27 12:28 ` [RFC PATCH V3 1/7] mm/slab: allow specifying freepointer offset when using constructor Harry Yoo
2025-10-28 17:43   ` Suren Baghdasaryan
2025-10-29  7:10     ` Harry Yoo
2025-10-30 14:35       ` Vlastimil Babka
2025-10-27 12:28 ` [RFC PATCH V3 2/7] ext4: specify the free pointer offset for ext4_inode_cache Harry Yoo
2025-10-28 17:22   ` Suren Baghdasaryan
2025-10-28 17:25     ` Suren Baghdasaryan
2025-10-27 12:28 ` [RFC PATCH V3 3/7] mm/slab: abstract slabobj_ext access via new slab_obj_ext() helper Harry Yoo
2025-10-28 17:55   ` Suren Baghdasaryan
2025-10-29  8:49     ` Harry Yoo
2025-10-29 15:24       ` Suren Baghdasaryan
2025-10-30  1:26         ` Harry Yoo
2025-10-30  5:03           ` Suren Baghdasaryan
2025-10-27 12:28 ` [RFC PATCH V3 4/7] mm/slab: use stride to access slabobj_ext Harry Yoo
2025-10-28 20:10   ` Suren Baghdasaryan
2025-10-27 12:28 ` [RFC PATCH V3 5/7] mm/memcontrol,alloc_tag: handle slabobj_ext access under KASAN poison Harry Yoo
2025-10-28 23:03   ` Suren Baghdasaryan
2025-10-29  8:06     ` Harry Yoo
2025-10-29 15:28       ` Suren Baghdasaryan
2025-10-27 12:28 ` [RFC PATCH V3 6/7] mm/slab: save memory by allocating slabobj_ext array from leftover Harry Yoo
2025-10-29  3:07   ` Suren Baghdasaryan
2025-10-29  7:59     ` Harry Yoo [this message]
2025-10-29 18:37       ` Suren Baghdasaryan
2025-10-30  0:40         ` Harry Yoo
2025-10-30 16:33           ` Vlastimil Babka
2025-10-29 18:45   ` Andrey Ryabinin
2025-10-30  1:11     ` Harry Yoo
2025-10-27 12:28 ` [RFC PATCH V3 7/7] mm/slab: place slabobj_ext metadata in unused space within s->size Harry Yoo
2025-10-29  3:19   ` Suren Baghdasaryan
2025-10-29 18:19   ` Andrey Ryabinin
2025-10-30  0:51     ` Harry Yoo
2025-10-30 12:41       ` Yeoreum Yun
2025-10-30 16:39 ` [RFC PATCH V3 0/7] mm/slab: reduce slab accounting memory overhead by allocating slabobj_ext metadata within unused slab space Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aQHJfyoUN-tbnVFr@hyeyoo \
    --to=harry.yoo@oracle.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=cl@linux.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=ryabinin.a.a@gmail.com \
    --cc=shakeel.butt@linux.dev \
    --cc=surenb@google.com \
    --cc=tytso@mit.edu \
    --cc=vbabka@suse.cz \
    --cc=vincenzo.frascino@arm.com \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).