linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Chengming Zhou <chengming.zhou@linux.dev>
To: Vlastimil Babka <vbabka@suse.cz>, cl@linux.com, penberg@kernel.org
Cc: rientjes@google.com, iamjoonsoo.kim@lge.com,
	akpm@linux-foundation.org, roman.gushchin@linux.dev,
	42.hyeyoo@gmail.com, willy@infradead.org, pcc@google.com,
	tytso@mit.edu, maz@kernel.org, ruansy.fnst@fujitsu.com,
	vishal.moola@gmail.com, lrh2000@pku.edu.cn, hughd@google.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Chengming Zhou <zhouchengming@bytedance.com>
Subject: Re: [RFC PATCH v2 3/6] slub: Don't freeze slabs for cpu partial
Date: Tue, 24 Oct 2023 10:39:37 +0800	[thread overview]
Message-ID: <217c52ba-3b06-488d-b21a-3d2dd62438a8@linux.dev> (raw)
In-Reply-To: <79a879cc-f5f8-08ef-0afa-3688d433a756@suse.cz>

On 2023/10/24 00:00, Vlastimil Babka wrote:
> On 10/21/23 16:43, chengming.zhou@linux.dev wrote:
>> From: Chengming Zhou <zhouchengming@bytedance.com>
>>
>> Now we will freeze slabs when moving them out of node partial list to
>> cpu partial list, this method needs two cmpxchg_double operations:
>>
>> 1. freeze slab (acquire_slab()) under the node list_lock
>> 2. get_freelist() when pick used in ___slab_alloc()
>>
>> Actually we don't need to freeze when moving slabs out of node partial
>> list, we can delay freeze to use slab freelist in ___slab_alloc(), so
>> we can save one cmpxchg_double().
>>
>> And there are other good points:
>>
>> 1. The moving of slabs between node partial list and cpu partial list
>>    becomes simpler, since we don't need to freeze or unfreeze at all.
>>
>> 2. The node list_lock contention would be less, since we only need to
>>    freeze one slab under the node list_lock. (In fact, we can first
>>    move slabs out of node partial list, don't need to freeze any slab
>>    at all, so the contention on slab won't transfer to the node list_lock
>>    contention.)
>>
>> We can achieve this because there is no concurrent path would manipulate
>> the partial slab list except the __slab_free() path, which is serialized
>> now.
>>
>> Note this patch just change the parts of moving the partial slabs for
>> easy code review, we will fix other parts in the following patches.
>> Specifically this patch change three paths:
>> 1. get partial slab from node: get_partial_node()
>> 2. put partial slab to node: __unfreeze_partials()
>> 3. cache partail slab on cpu when __slab_free()
> 
> So the problem with this approach that one patch breaks things and another
> one fixes them up, is that git bisect becomes problematic, so we shouldn't
> do that and instead bite the bullet and deal with a potentially large patch.
> At some level it's unavoidable as one has to grasp all the moving pieces
> anyway to see e.g. if the changes in allocation path are compatible with the
> changes in freeing.
> When possible, we can do preparatory stuff that doesn't break things like in
> patches 1 and 2, maybe get_cpu_partial() could also be introduced (even if
> unused) before switching the world over to the new scheme in a single patch
> (and possible cleanups afterwards). So would it be possible to redo it in
> such way please?

Ok, I will change to this way. I didn't know how to handle this potentially
large patch gracefully at first. Your detailed advice is very helpful to me,
I will prepare all possible preparatory stuff, then change all parts over
in one patch.

> 
> <snip>
> 
>> @@ -2621,23 +2622,7 @@ static void __unfreeze_partials(struct kmem_cache *s, struct slab *partial_slab)
>>  			spin_lock_irqsave(&n->list_lock, flags);
>>  		}
>>  
>> -		do {
>> -
>> -			old.freelist = slab->freelist;
>> -			old.counters = slab->counters;
>> -			VM_BUG_ON(!old.frozen);
>> -
>> -			new.counters = old.counters;
>> -			new.freelist = old.freelist;
>> -
>> -			new.frozen = 0;
>> -
>> -		} while (!__slab_update_freelist(s, slab,
>> -				old.freelist, old.counters,
>> -				new.freelist, new.counters,
>> -				"unfreezing slab"));
>> -
>> -		if (unlikely(!new.inuse && n->nr_partial >= s->min_partial)) {
>> +		if (unlikely(!slab->inuse && n->nr_partial >= s->min_partial)) {
>>  			slab->next = slab_to_discard;
>>  			slab_to_discard = slab;
>>  		} else {
>> @@ -3634,18 +3619,8 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>>  		was_frozen = new.frozen;
>>  		new.inuse -= cnt;
>>  		if ((!new.inuse || !prior) && !was_frozen) {
>> -
>> -			if (kmem_cache_has_cpu_partial(s) && !prior) {
>> -
>> -				/*
>> -				 * Slab was on no list before and will be
>> -				 * partially empty
>> -				 * We can defer the list move and instead
>> -				 * freeze it.
>> -				 */
>> -				new.frozen = 1;
>> -
>> -			} else { /* Needs to be taken off a list */
>> +			/* Needs to be taken off a list */
>> +			if (!kmem_cache_has_cpu_partial(s) || prior) {
>>  
>>  				n = get_node(s, slab_nid(slab));
>>  				/*
>> @@ -3675,7 +3650,7 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>>  			 * activity can be necessary.
>>  			 */
>>  			stat(s, FREE_FROZEN);
>> -		} else if (new.frozen) {
>> +		} else if (kmem_cache_has_cpu_partial(s) && !prior) {
>>  			/*
>>  			 * If we just froze the slab then put it onto the
>>  			 * per cpu partial list.
> 
> I think this comment is now misleading, we didn't freeze the slab, so it's
> now something like "if we started with a full slab...".

Ok, will check and change these inconsistent comments by the way.

Thanks!


  reply	other threads:[~2023-10-24  2:41 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-21 14:43 [RFC PATCH v2 0/6] slub: Delay freezing of CPU partial slabs chengming.zhou
2023-10-21 14:43 ` [RFC PATCH v2 1/6] slub: Keep track of whether slub is on the per-node partial list chengming.zhou
2023-10-23 12:32   ` Matthew Wilcox
2023-10-23 16:22     ` Matthew Wilcox
2023-10-24  1:57       ` Chengming Zhou
2023-10-21 14:43 ` [RFC PATCH v2 2/6] slub: Prepare __slab_free() for unfrozen partial slab out of node " chengming.zhou
2023-10-21 14:43 ` [RFC PATCH v2 3/6] slub: Don't freeze slabs for cpu partial chengming.zhou
2023-10-23 16:00   ` Vlastimil Babka
2023-10-24  2:39     ` Chengming Zhou [this message]
2023-10-21 14:43 ` [RFC PATCH v2 4/6] slub: Simplify acquire_slab() chengming.zhou
2023-10-21 14:43 ` [RFC PATCH v2 5/6] slub: Introduce get_cpu_partial() chengming.zhou
2023-10-21 14:43 ` [RFC PATCH v2 6/6] slub: Optimize deactivate_slab() chengming.zhou
2023-10-22 14:52 ` [RFC PATCH v2 0/6] slub: Delay freezing of CPU partial slabs Hyeonggon Yoo
2023-10-24  2:02   ` Chengming Zhou
2023-10-23 15:46 ` Vlastimil Babka
2023-10-23 17:00   ` Christoph Lameter (Ampere)
2023-10-23 18:44     ` Vlastimil Babka
2023-10-23 21:05       ` Christoph Lameter (Ampere)
2023-10-24  8:19         ` Vlastimil Babka
2023-10-24 11:03         ` Chengming Zhou
2023-10-24  2:20   ` Chengming Zhou
2023-10-24  8:20     ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=217c52ba-3b06-488d-b21a-3d2dd62438a8@linux.dev \
    --to=chengming.zhou@linux.dev \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lrh2000@pku.edu.cn \
    --cc=maz@kernel.org \
    --cc=pcc@google.com \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=ruansy.fnst@fujitsu.com \
    --cc=tytso@mit.edu \
    --cc=vbabka@suse.cz \
    --cc=vishal.moola@gmail.com \
    --cc=willy@infradead.org \
    --cc=zhouchengming@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).