From: "Harry Yoo (Oracle)" <harry@kernel.org>
To: Hao Li <hao.li@linux.dev>
Cc: hu.shengming@zte.com.cn, vbabka@kernel.org,
akpm@linux-foundation.org, cl@gentwo.org, rientjes@google.com,
roman.gushchin@linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, zhang.run@zte.com.cn,
xu.xin16@zte.com.cn, yang.tao172@zte.com.cn,
yang.yang29@zte.com.cn
Subject: Re: [PATCH v2] mm/slub: skip freelist construction for whole-slab bulk refill
Date: Thu, 2 Apr 2026 13:53:57 +0900
Message-ID: <ac32ZQMxSSZ2VsNY@hyeyoo>
In-Reply-To: <fz2shejnypqsu74zpoy66senjbpyl2bbvcnoxu6hvfs77c7jtr@o2acnd2hzd4x>
On Wed, Apr 01, 2026 at 02:55:23PM +0800, Hao Li wrote:
> On Wed, Apr 01, 2026 at 12:57:25PM +0800, hu.shengming@zte.com.cn wrote:
> > @@ -4395,6 +4458,48 @@ static unsigned int alloc_from_new_slab(struct kmem_cache *s, struct slab *slab,
> >  	return allocated;
> >  }
> >
> > +static unsigned int alloc_whole_from_new_slab(struct kmem_cache *s,
> > +		struct slab *slab, void **p, bool allow_spin)
> > +{
> > +
> > +	unsigned int allocated = 0;
> > +	void *object, *start;
> > +
> > +	if (alloc_whole_from_new_slab_random(s, slab, p, allow_spin,
> > +					     &allocated)) {
> > +		goto done;
> > +	}
> > +
> > +	start = fixup_red_left(s, slab_address(slab));
> > +	object = setup_object(s, start);
> > +
> > +	while (allocated < slab->objects - 1) {
> > +		p[allocated] = object;
> > +		maybe_wipe_obj_freeptr(s, object);
> > +
> > +		allocated++;
> > +		object += s->size;
> > +		object = setup_object(s, object);
> > +	}
>
> Also, I feel the current patch contains some duplicated code like this loop.
>
> Would it make sense to split allocate_slab() into two functions?
>
> For example,
> the first part could be called allocate_slab_meta_setup() (just an example name)
> And, the second part could be allocate_slab_objects_setup(), with the core logic
> being the loop over objects. Then allocate_slab_objects_setup() could support
> two modes: one called BUILD_FREELIST, which builds the freelist, and another
> called EMIT_OBJECTS, which skips building the freelist and directly places the
> objects into the target array.

Something similar, but with a bit more thought put into unifying the code
(**regardless of CONFIG_SLAB_FREELIST_RANDOM**) and avoiding treating
"the whole slab->freelist fits into the sheaf" as a special case:

- allocate_slab() no longer builds the freelist.
  The freelist is built only when there are objects left after
  allocating objects from the new slab.

- new_slab() allocates a new slab AND builds the freelist
  to keep the existing behaviour.

- refill_objects() allocates a slab using allocate_slab(),
  and passes it to alloc_from_new_slab().

  alloc_from_new_slab() consumes some objects in random order,
  and then builds the freelist from the objects left (if any).
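
To make the split above concrete, here is a minimal userspace sketch of it.
Everything prefixed with model_ is a hypothetical stand-in, not kernel code:
the free pointer is assumed to live at offset 0 of each object, objects are
handed out in address order (no randomization), and debugging hooks such as
setup_object() are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct slab (assumption). */
struct model_slab {
	unsigned int objects;	/* total objects in the slab */
	unsigned int inuse;	/* objects already handed out */
	size_t size;		/* object stride in bytes */
	char *base;		/* backing memory */
	void *freelist;		/* head of the free list, or NULL */
};

/* Assumption: the free pointer is stored at offset 0 of a free object. */
static void model_set_freepointer(void *obj, void *next)
{
	*(void **)obj = next;
}

/* Models the proposed allocate_slab(): gets memory, builds no freelist. */
static int model_allocate_slab(struct model_slab *slab,
			       unsigned int objects, size_t size)
{
	assert(objects > 0 && size >= sizeof(void *) &&
	       size % sizeof(void *) == 0);
	slab->base = calloc(objects, size);
	if (!slab->base)
		return -1;
	slab->objects = objects;
	slab->inuse = 0;
	slab->size = size;
	slab->freelist = NULL;
	return 0;
}

/* Links objects [inuse, objects) in address order; NULL if none are left. */
static void model_build_freelist(struct model_slab *slab)
{
	void *cur, *next;
	unsigned int i;

	slab->freelist = NULL;
	if (slab->inuse >= slab->objects)
		return;
	cur = slab->base + slab->inuse * slab->size;
	slab->freelist = cur;
	for (i = slab->inuse + 1; i < slab->objects; i++) {
		next = slab->base + i * slab->size;
		model_set_freepointer(cur, next);
		cur = next;
	}
	model_set_freepointer(cur, NULL);
}

/* Models new_slab(): allocate AND build the full freelist. */
static int model_new_slab(struct model_slab *slab,
			  unsigned int objects, size_t size)
{
	if (model_allocate_slab(slab, objects, size))
		return -1;
	model_build_freelist(slab);
	return 0;
}

/* Models alloc_from_new_slab(): hand out up to count objects, link the rest. */
static unsigned int model_alloc_from_new_slab(struct model_slab *slab,
					      void **p, unsigned int count)
{
	unsigned int want = count < slab->objects ? count : slab->objects;

	while (slab->inuse < want) {
		p[slab->inuse] = slab->base + slab->inuse * slab->size;
		slab->inuse++;
	}
	model_build_freelist(slab);
	return slab->inuse;
}

/* Helper for inspection: counts the entries on the freelist. */
static unsigned int model_freelist_len(const struct model_slab *slab)
{
	unsigned int n = 0;
	void *obj;

	for (obj = slab->freelist; obj; obj = *(void **)obj)
		n++;
	return n;
}
```

In this model, taking the whole slab simply leaves the freelist NULL, so the
"whole slab fits" case needs no separate path.
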

We could actually abstract "iterating free objects in random order"
into an API, and there would be two users of the API:

- Building the freelist
- Filling objects into the sheaf (without building the freelist!)

Something like this...
(names here are just examples, I'm not good at naming things!)

struct freelist_iter {
	int pos;
	int freelist_count;
	int page_limit;
	void *start;
};

/* note: handling !allow_spin nicely is tricky :-) */
alloc_from_new_slab(...) {
	struct freelist_iter fit;

	prep_freelist_iter(s, slab, &fit, allow_spin);
	while (slab->inuse < min(count, slab->objects)) {
		p[slab->inuse++] = next_freelist_entry(s, &fit);
	}

	if (slab->inuse < slab->objects)
		build_freelist(s, slab, &fit);
}

build_freelist(s, slab, fit) {
	size = slab->objects - slab->inuse;

	cur = next_freelist_entry(s, fit);
	cur = setup_object(s, cur);
	slab->freelist = cur;
	for (i = 1; i < size; i++) {
		next = next_freelist_entry(s, fit);
		next = setup_object(s, next);
		set_freepointer(s, cur, next);
		cur = next;
	}
	/* terminate the list */
	set_freepointer(s, cur, NULL);
}

#ifdef CONFIG_SLAB_FREELIST_RANDOM
prep_freelist_iter(s, slab, fit, allow_spin) {
	fit->freelist_count = oo_objects(s->oo);
	fit->page_limit = slab->objects * s->size;
	fit->start = fixup_red_left(s, slab_address(slab));

	if (slab->objects < 2 || !s->random_seq) {
		fit->pos = 0;
	} else if (allow_spin) {
		fit->pos = get_random_u32_below(fit->freelist_count);
	} else {
		struct rnd_state *state;

		/*
		 * An interrupt or NMI handler might interrupt and change
		 * the state in the middle, but that's safe.
		 */
		state = &get_cpu_var(slab_rnd_state);
		fit->pos = prandom_u32_state(state) % fit->freelist_count;
		put_cpu_var(slab_rnd_state);
	}

	return;
}

next_freelist_entry(s, fit) {
	/*
	 * If the target page allocation failed, the number of objects on the
	 * page might be smaller than the usual size defined by the cache.
	 */
	do {
		idx = s->random_seq[fit->pos];
		fit->pos += 1;
		if (fit->pos >= fit->freelist_count)
			fit->pos = 0;
	} while (unlikely(idx >= fit->page_limit));

	return (char *)fit->start + idx;
}
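#else
prep_freelist_iter(s, slab, fit, allow_spin) {
	/* next_freelist_entry() below needs fit->start as well */
	fit->start = fixup_red_left(s, slab_address(slab));
	fit->pos = 0;
	return;
}

next_freelist_entry(s, fit) {
	void *next = fit->start + fit->pos * s->size;

	fit->pos++;
	return next;
}
#endif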
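
For anyone who wants to poke at the iterator idea outside the kernel, below
is a minimal userspace model of the random-order variant. It is only a
sketch under a few assumptions: all model_ names are hypothetical, the
precomputed sequence is passed in instead of coming from s->random_seq, the
starting position is chosen by the caller instead of an RNG, and the free
pointer is assumed to sit at offset 0 of each object. Callers must not ask
for more entries in total than the slab has objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the freelist_iter idea (assumption). */
struct model_iter {
	unsigned int pos;		/* next index into random_seq */
	unsigned int freelist_count;	/* number of entries in random_seq */
	size_t page_limit;		/* bytes covered by this slab's objects */
	const unsigned int *random_seq;	/* precomputed byte offsets */
	char *start;			/* address of the first object */
};

static void model_prep_iter(struct model_iter *it, const unsigned int *seq,
			    unsigned int seq_len, char *start,
			    unsigned int objects, size_t size,
			    unsigned int first_pos)
{
	assert(seq_len > 0 && objects > 0);
	it->random_seq = seq;
	it->freelist_count = seq_len;
	it->page_limit = objects * size;
	it->start = start;
	it->pos = first_pos % seq_len;
}

/*
 * Returns the next object, wrapping around the sequence and skipping
 * offsets beyond a slab that is smaller than the cache's usual size.
 */
static void *model_next_entry(struct model_iter *it)
{
	unsigned int idx;

	do {
		idx = it->random_seq[it->pos];
		it->pos++;
		if (it->pos >= it->freelist_count)
			it->pos = 0;
	} while (idx >= it->page_limit);

	return it->start + idx;
}

/* User 1: fill n objects into an array without building a freelist. */
static unsigned int model_fill(struct model_iter *it, void **p, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		p[i] = model_next_entry(it);
	return n;
}

/* User 2: link the next n objects into a NULL-terminated list. */
static void *model_build_freelist(struct model_iter *it, unsigned int n)
{
	void *head, *cur, *next;
	unsigned int i;

	if (!n)
		return NULL;
	head = cur = model_next_entry(it);
	for (i = 1; i < n; i++) {
		next = model_next_entry(it);
		*(void **)cur = next;
		cur = next;
	}
	*(void **)cur = NULL;
	return head;
}
```

Because both users pull from the same iterator, the objects filled into the
array and the objects linked into the freelist cannot overlap, as long as
the sequence is a permutation of the object offsets.
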
--
Cheers,
Harry / Hyeonggon