All of lore.kernel.org
 help / color / mirror / Atom feed
From: Harry Yoo <harry.yoo@oracle.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, vbabka@suse.cz, andreyknvl@gmail.com,
	cl@linux.com, dvyukov@google.com, glider@google.com,
	hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org,
	muchun.song@linux.dev, rientjes@google.com,
	roman.gushchin@linux.dev, ryabinin.a.a@gmail.com,
	shakeel.butt@linux.dev, vincenzo.frascino@arm.com,
	yeoreum.yun@arm.com, tytso@mit.edu, adilger.kernel@dilger.ca,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH V3 6/7] mm/slab: save memory by allocating slabobj_ext array from leftover
Date: Wed, 29 Oct 2025 16:59:59 +0900	[thread overview]
Message-ID: <aQHJfyoUN-tbnVFr@hyeyoo> (raw)
In-Reply-To: <CAJuCfpGY0h2d6VEAEa4kjH2yUMGDdke_QTFt6d+gb+kH=rnXyQ@mail.gmail.com>

On Tue, Oct 28, 2025 at 08:07:42PM -0700, Suren Baghdasaryan wrote:
> On Mon, Oct 27, 2025 at 5:29 AM Harry Yoo <harry.yoo@oracle.com> wrote:
> >
> > The leftover space in a slab is always smaller than s->size, and
> > kmem caches for large objects that are not power-of-two sizes tend to have
> > a greater amount of leftover space per slab. In some cases, the leftover
> > space is larger than the size of the slabobj_ext array for the slab.
> >
> > An excellent example of such a cache is ext4_inode_cache. On my system,
> > the object size is 1144, with a preferred order of 3, 28 objects per slab,
> > and 736 bytes of leftover space per slab.
> >
> > Since the size of the slabobj_ext array is only 224 bytes (w/o mem
> > profiling) or 448 bytes (w/ mem profiling) per slab, the entire array
> > fits within the leftover space.
> >
> > Allocate the slabobj_exts array from this unused space instead of using
> > kcalloc(), when it is large enough. The array is always allocated when
> > creating new slabs, because implementing lazy allocation correctly is
> > difficult without expensive synchronization.
> >
> > To avoid unnecessary overhead when MEMCG (with SLAB_ACCOUNT) and
> > MEM_ALLOC_PROFILING are not used for the cache, only allocate the
> > slabobj_ext array only when either of them are enabled when slabs are
> > created.
> >
> > [ MEMCG=y, MEM_ALLOC_PROFILING=n ]
> >
> > Before patch (creating 2M directories on ext4):
> >   Slab:            3575348 kB
> >   SReclaimable:    3137804 kB
> >   SUnreclaim:       437544 kB
> >
> > After patch (creating 2M directories on ext4):
> >   Slab:            3558236 kB
> >   SReclaimable:    3139268 kB
> >   SUnreclaim:       418968 kB (-18.14 MiB)
> >
> > Enjoy the memory savings!
> >
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > ---
> >  mm/slub.c | 147 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 142 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 13acc9437ef5..8101df5fdccf 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > +static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> > +{
> > +       unsigned long obj_exts;
> > +
> > +       if (!obj_exts_fit_within_slab_leftover(s, slab))
> > +               return false;
> > +
> > +       obj_exts = (unsigned long)slab_address(slab);
> > +       obj_exts += obj_exts_offset_in_slab(s, slab);
> > +       return obj_exts == slab_obj_exts(slab);
> 
> You can check that slab_obj_exts(slab) is not NULL before making the
> above calculations.

Did you mean this?

  if (!slab_obj_exts(slab))
      return false;

If so, yes that makes sense.

> > @@ -2185,6 +2311,11 @@ static inline void free_slab_obj_exts(struct slab *slab)
> >  {
> >  }
> >
> > +static inline void alloc_slab_obj_exts_early(struct kmem_cache *s,
> > +                                                      struct slab *slab)
> > +{
> > +}
> > +
> >  #endif /* CONFIG_SLAB_OBJ_EXT */
> >
> >  #ifdef CONFIG_MEM_ALLOC_PROFILING
> > @@ -3155,7 +3286,9 @@ static inline bool shuffle_freelist(struct kmem_cache *s, struct slab *slab)
> >  static __always_inline void account_slab(struct slab *slab, int order,
> >                                          struct kmem_cache *s, gfp_t gfp)
> >  {
> > -       if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
> > +       if (memcg_kmem_online() &&
> > +                       (s->flags & SLAB_ACCOUNT) &&
> > +                       !slab_obj_exts(slab))
> >                 alloc_slab_obj_exts(slab, s, gfp, true);
> 
> Don't you need to add a check for !obj_exts_in_slab() inside
> alloc_slab_obj_exts() to avoid allocating slab->obj_exts?

slab_obj_exts() should have returned a nonzero value
and then we don't call alloc_slab_obj_exts()?

> >         mod_node_page_state(slab_pgdat(slab), cache_vmstat_idx(s),
> > @@ -3219,9 +3352,6 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >         slab->objects = oo_objects(oo);slab_obj_exts
> >         slab->inuse = 0;
> >         slab->frozen = 0;
> > -       init_slab_obj_exts(slab);
> > -
> > -       account_slab(slab, oo_order(oo), s, flags);
> >
> >         slab->slab_cache = s;
> >
> > @@ -3230,6 +3360,13 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >         start = slab_address(slab);
> >
> >         setup_slab_debug(s, slab, start);
> > +       init_slab_obj_exts(slab);
> > +       /*
> > +        * Poison the slab before initializing the slabobj_ext array
> > +        * to prevent the array from being overwritten.
> > +        */
> > +       alloc_slab_obj_exts_early(s, slab);
> > +       account_slab(slab, oo_order(oo), s, flags);
> 
>  alloc_slab_obj_exts() is called in 2 other places:
> 1. __memcg_slab_post_alloc_hook()
> 2. prepare_slab_obj_exts_hook()
> 
> Don't you need alloc_slab_obj_exts_early() there as well?

That's good point, and I thought it's difficult to address
concurrency problem without using a per-slab lock.

Thread A                    Thread B
- sees slab->obj_exts == 0 
                            - sees slab->obj_exts == 0
			    - allocates the vector from unused space
			      and initializes it.
			    - try cmpxchg()
- allocates the vector
  from unused space and
  initializes it.
  (the vector is already
   in use and it's overwritten!)

- try cmpxchg()

But since this is slowpath, using slab_{lock,unlock}() here is probably
fine. What do you think?

-- 
Cheers,
Harry / Hyeonggon

  reply	other threads:[~2025-10-29  8:00 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-27 12:28 [RFC PATCH V3 0/7] mm/slab: reduce slab accounting memory overhead by allocating slabobj_ext metadata within unused slab space Harry Yoo
2025-10-27 12:28 ` [RFC PATCH V3 1/7] mm/slab: allow specifying freepointer offset when using constructor Harry Yoo
2025-10-28 17:43   ` Suren Baghdasaryan
2025-10-29  7:10     ` Harry Yoo
2025-10-30 14:35       ` Vlastimil Babka
2025-10-27 12:28 ` [RFC PATCH V3 2/7] ext4: specify the free pointer offset for ext4_inode_cache Harry Yoo
2025-10-28 17:22   ` Suren Baghdasaryan
2025-10-28 17:25     ` Suren Baghdasaryan
2025-10-27 12:28 ` [RFC PATCH V3 3/7] mm/slab: abstract slabobj_ext access via new slab_obj_ext() helper Harry Yoo
2025-10-28 17:55   ` Suren Baghdasaryan
2025-10-29  8:49     ` Harry Yoo
2025-10-29 15:24       ` Suren Baghdasaryan
2025-10-30  1:26         ` Harry Yoo
2025-10-30  5:03           ` Suren Baghdasaryan
2025-10-27 12:28 ` [RFC PATCH V3 4/7] mm/slab: use stride to access slabobj_ext Harry Yoo
2025-10-28 20:10   ` Suren Baghdasaryan
2025-10-27 12:28 ` [RFC PATCH V3 5/7] mm/memcontrol,alloc_tag: handle slabobj_ext access under KASAN poison Harry Yoo
2025-10-28 23:03   ` Suren Baghdasaryan
2025-10-29  8:06     ` Harry Yoo
2025-10-29 15:28       ` Suren Baghdasaryan
2025-10-27 12:28 ` [RFC PATCH V3 6/7] mm/slab: save memory by allocating slabobj_ext array from leftover Harry Yoo
2025-10-29  3:07   ` Suren Baghdasaryan
2025-10-29  7:59     ` Harry Yoo [this message]
2025-10-29 18:37       ` Suren Baghdasaryan
2025-10-30  0:40         ` Harry Yoo
2025-10-30 16:33           ` Vlastimil Babka
2025-10-29 18:45   ` Andrey Ryabinin
2025-10-30  1:11     ` Harry Yoo
2025-10-27 12:28 ` [RFC PATCH V3 7/7] mm/slab: place slabobj_ext metadata in unused space within s->size Harry Yoo
2025-10-29  3:19   ` Suren Baghdasaryan
2025-12-22 11:14     ` Harry Yoo
2025-10-29 18:19   ` Andrey Ryabinin
2025-10-30  0:51     ` Harry Yoo
2025-10-30 12:41       ` Yeoreum Yun
2025-10-30 16:39 ` [RFC PATCH V3 0/7] mm/slab: reduce slab accounting memory overhead by allocating slabobj_ext metadata within unused slab space Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aQHJfyoUN-tbnVFr@hyeyoo \
    --to=harry.yoo@oracle.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=cl@linux.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=ryabinin.a.a@gmail.com \
    --cc=shakeel.butt@linux.dev \
    --cc=surenb@google.com \
    --cc=tytso@mit.edu \
    --cc=vbabka@suse.cz \
    --cc=vincenzo.frascino@arm.com \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.