From: Chengming Zhou <chengming.zhou@linux.dev>
To: Vlastimil Babka <vbabka@suse.cz>, cl@linux.com, penberg@kernel.org
Cc: rientjes@google.com, iamjoonsoo.kim@lge.com,
akpm@linux-foundation.org, roman.gushchin@linux.dev,
42.hyeyoo@gmail.com, willy@infradead.org, pcc@google.com,
tytso@mit.edu, maz@kernel.org, ruansy.fnst@fujitsu.com,
vishal.moola@gmail.com, lrh2000@pku.edu.cn, hughd@google.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Chengming Zhou <zhouchengming@bytedance.com>
Subject: Re: [RFC PATCH v2 3/6] slub: Don't freeze slabs for cpu partial
Date: Tue, 24 Oct 2023 10:39:37 +0800
Message-ID: <217c52ba-3b06-488d-b21a-3d2dd62438a8@linux.dev>
In-Reply-To: <79a879cc-f5f8-08ef-0afa-3688d433a756@suse.cz>
On 2023/10/24 00:00, Vlastimil Babka wrote:
> On 10/21/23 16:43, chengming.zhou@linux.dev wrote:
>> From: Chengming Zhou <zhouchengming@bytedance.com>
>>
>> Currently we freeze slabs when moving them out of the node partial list
>> to the cpu partial list, which needs two cmpxchg_double operations:
>>
>> 1. freeze the slab (acquire_slab()) under the node list_lock
>> 2. get_freelist() when the slab is picked for use in ___slab_alloc()
>>
>> Actually we don't need to freeze when moving slabs out of the node
>> partial list; we can delay freezing until the slab freelist is used in
>> ___slab_alloc(), saving one cmpxchg_double().
>>
>> There are other benefits as well:
>>
>> 1. Moving slabs between the node partial list and the cpu partial list
>> becomes simpler, since we don't need to freeze or unfreeze at all.
>>
>> 2. Contention on the node list_lock is reduced, since we only need to
>> freeze one slab under the node list_lock. (In fact, we can first
>> move slabs off the node partial list without freezing any slab at
>> all, so contention on a slab won't turn into contention on the node
>> list_lock.)
>>
>> We can achieve this because no concurrent path manipulates the partial
>> slab list except the __slab_free() path, which is now serialized.
>>
>> Note this patch only changes the parts that move partial slabs around,
>> to keep code review easy; the remaining parts will be fixed in the
>> following patches. Specifically, this patch changes three paths:
>> 1. getting a partial slab from a node: get_partial_node()
>> 2. putting a partial slab back on a node: __unfreeze_partials()
>> 3. caching a partial slab on the cpu in __slab_free()
>
> So the problem with this approach, where one patch breaks things and
> another one fixes them up, is that git bisect becomes problematic, so we
> shouldn't do that and should instead bite the bullet and deal with a
> potentially large patch. At some level it's unavoidable, as one has to
> grasp all the moving pieces anyway to see e.g. whether the changes in
> the allocation path are compatible with the changes in freeing.
> When possible, we can do preparatory stuff that doesn't break things, as
> in patches 1 and 2; maybe get_cpu_partial() could also be introduced
> (even if unused) before switching the world over to the new scheme in a
> single patch (and possible cleanups afterwards). So would it be possible
> to redo it in such a way, please?
Ok, I will restructure it that way. I wasn't sure how to handle this
potentially large patch gracefully at first, so your detailed advice is
very helpful. I will prepare all the possible preparatory patches, then
change all the parts over in one patch.
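
To double-check the cmpxchg_double saving, here is a rough userspace
sketch of the before/after flow (illustrative only, not slub.c code: a
single flag word stands in for the kernel's cmpxchg_double over
freelist+counters, and all names are simplified):

/*
 * Model of the freeze-delay idea. One CAS on "counters" stands in for
 * a cmpxchg_double; FROZEN and FREELIST_TAKEN model the state changes
 * done by acquire_slab()/get_freelist(). Assumes GCC/Clang __atomic
 * builtins; single-threaded, so the expected CAS values are known.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define FROZEN		(1ULL << 0)
#define FREELIST_TAKEN	(1ULL << 1)

struct slab { uint64_t counters; };

static int ncas;	/* count of atomic read-modify-write ops */

static bool cas(uint64_t *p, uint64_t old, uint64_t new)
{
	ncas++;
	return __atomic_compare_exchange_n(p, &old, new, false,
					   __ATOMIC_ACQ_REL,
					   __ATOMIC_ACQUIRE);
}

/* Old scheme: freeze when leaving the node partial list (under the
 * node list_lock), then take the freelist later in ___slab_alloc(). */
static void old_scheme(struct slab *s)
{
	cas(&s->counters, 0, FROZEN);		/* acquire_slab() */
	cas(&s->counters, FROZEN,
	    FROZEN | FREELIST_TAKEN);		/* get_freelist() */
}

/* New scheme: the list move itself needs no CAS; a single CAS in
 * ___slab_alloc() freezes the slab and takes its freelist at once. */
static void new_scheme(struct slab *s)
{
	cas(&s->counters, 0, FROZEN | FREELIST_TAKEN);
}

int main(void)
{
	struct slab a = { 0 }, b = { 0 };

	ncas = 0;
	old_scheme(&a);
	printf("old scheme: %d atomic ops\n", ncas);	/* prints 2 */

	ncas = 0;
	new_scheme(&b);
	printf("new scheme: %d atomic ops\n", ncas);	/* prints 1 */
	return 0;
}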
>
> <snip>
>
>> @@ -2621,23 +2622,7 @@ static void __unfreeze_partials(struct kmem_cache *s, struct slab *partial_slab)
>> spin_lock_irqsave(&n->list_lock, flags);
>> }
>>
>> - do {
>> -
>> - old.freelist = slab->freelist;
>> - old.counters = slab->counters;
>> - VM_BUG_ON(!old.frozen);
>> -
>> - new.counters = old.counters;
>> - new.freelist = old.freelist;
>> -
>> - new.frozen = 0;
>> -
>> - } while (!__slab_update_freelist(s, slab,
>> - old.freelist, old.counters,
>> - new.freelist, new.counters,
>> - "unfreezing slab"));
>> -
>> - if (unlikely(!new.inuse && n->nr_partial >= s->min_partial)) {
>> + if (unlikely(!slab->inuse && n->nr_partial >= s->min_partial)) {
>> slab->next = slab_to_discard;
>> slab_to_discard = slab;
>> } else {
>> @@ -3634,18 +3619,8 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>> was_frozen = new.frozen;
>> new.inuse -= cnt;
>> if ((!new.inuse || !prior) && !was_frozen) {
>> -
>> - if (kmem_cache_has_cpu_partial(s) && !prior) {
>> -
>> - /*
>> - * Slab was on no list before and will be
>> - * partially empty
>> - * We can defer the list move and instead
>> - * freeze it.
>> - */
>> - new.frozen = 1;
>> -
>> - } else { /* Needs to be taken off a list */
>> + /* Needs to be taken off a list */
>> + if (!kmem_cache_has_cpu_partial(s) || prior) {
>>
>> n = get_node(s, slab_nid(slab));
>> /*
>> @@ -3675,7 +3650,7 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>> * activity can be necessary.
>> */
>> stat(s, FREE_FROZEN);
>> - } else if (new.frozen) {
>> + } else if (kmem_cache_has_cpu_partial(s) && !prior) {
>> /*
>> * If we just froze the slab then put it onto the
>> * per cpu partial list.
>
> I think this comment is now misleading; we didn't freeze the slab, so it
> should now be something like "if we started with a full slab...".
Ok, I will check and fix these inconsistent comments as well.
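
For that branch, something like this perhaps (hypothetical comment
wording; the put_cpu_partial()/stat() lines follow mainline slub.c from
memory, they are not part of this diff):

	} else if (kmem_cache_has_cpu_partial(s) && !prior) {
		/*
		 * If we started with a full slab then put it onto the
		 * per cpu partial list.
		 */
		put_cpu_partial(s, slab, 1);
		stat(s, CPU_PARTIAL_FREE);
	}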
Thanks!