From: Chengming Zhou <chengming.zhou@linux.dev>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: vbabka@suse.cz, cl@linux.com, penberg@kernel.org,
rientjes@google.com, iamjoonsoo.kim@lge.com,
akpm@linux-foundation.org, roman.gushchin@linux.dev,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Chengming Zhou <zhouchengming@bytedance.com>
Subject: Re: [PATCH v5 7/9] slub: Optimize deactivate_slab()
Date: Sun, 3 Dec 2023 19:47:18 +0800 [thread overview]
Message-ID: <294cf54b-cdf5-4441-9924-67d934e54883@linux.dev> (raw)
In-Reply-To: <CAB=+i9T+aLEj6BHdXXJxqY914O1ZVCbmL8iL_5wiFB3dQN6yUg@mail.gmail.com>
On 2023/12/3 19:19, Hyeonggon Yoo wrote:
> On Sun, Dec 3, 2023 at 7:26 PM Chengming Zhou <chengming.zhou@linux.dev> wrote:
>>
>> On 2023/12/3 17:23, Hyeonggon Yoo wrote:
>>> On Thu, Nov 2, 2023 at 12:25 PM <chengming.zhou@linux.dev> wrote:
>>>>
>>>> From: Chengming Zhou <zhouchengming@bytedance.com>
>>>>
>>>> Since the introduce of unfrozen slabs on cpu partial list, we don't
>>>> need to synchronize the slab frozen state under the node list_lock.
>>>>
>>>> The caller of deactivate_slab() and the caller of __slab_free() won't
>>>> manipulate the slab list concurrently.
>>>>
>>>> So we can get node list_lock in the last stage if we really need to
>>>> manipulate the slab list in this path.
>>>>
>>>> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>>>> ---
>>>> mm/slub.c | 79 ++++++++++++++++++-------------------------------------
>>>> 1 file changed, 26 insertions(+), 53 deletions(-)
>>>>
>>>> diff --git a/mm/slub.c b/mm/slub.c
>>>> index bcb5b2c4e213..d137468fe4b9 100644
>>>> --- a/mm/slub.c
>>>> +++ b/mm/slub.c
>>>> @@ -2468,10 +2468,8 @@ static void init_kmem_cache_cpus(struct kmem_cache *s)
>>>> static void deactivate_slab(struct kmem_cache *s, struct slab *slab,
>>>> void *freelist)
>>>> {
>>>> - enum slab_modes { M_NONE, M_PARTIAL, M_FREE, M_FULL_NOLIST };
>>>> struct kmem_cache_node *n = get_node(s, slab_nid(slab));
>>>> int free_delta = 0;
>>>> - enum slab_modes mode = M_NONE;
>>>> void *nextfree, *freelist_iter, *freelist_tail;
>>>> int tail = DEACTIVATE_TO_HEAD;
>>>> unsigned long flags = 0;
>>>> @@ -2509,65 +2507,40 @@ static void deactivate_slab(struct kmem_cache *s, struct slab *slab,
>>>> /*
>>>> * Stage two: Unfreeze the slab while splicing the per-cpu
>>>> * freelist to the head of slab's freelist.
>>>> - *
>>>> - * Ensure that the slab is unfrozen while the list presence
>>>> - * reflects the actual number of objects during unfreeze.
>>>> - *
>>>> - * We first perform cmpxchg holding lock and insert to list
>>>> - * when it succeed. If there is mismatch then the slab is not
>>>> - * unfrozen and number of objects in the slab may have changed.
>>>> - * Then release lock and retry cmpxchg again.
>>>> */
>>>> -redo:
>>>> -
>>>> - old.freelist = READ_ONCE(slab->freelist);
>>>> - old.counters = READ_ONCE(slab->counters);
>>>> - VM_BUG_ON(!old.frozen);
>>>> -
>>>> - /* Determine target state of the slab */
>>>> - new.counters = old.counters;
>>>> - if (freelist_tail) {
>>>> - new.inuse -= free_delta;
>>>> - set_freepointer(s, freelist_tail, old.freelist);
>>>> - new.freelist = freelist;
>>>> - } else
>>>> - new.freelist = old.freelist;
>>>> -
>>>> - new.frozen = 0;
>>>> + do {
>>>> + old.freelist = READ_ONCE(slab->freelist);
>>>> + old.counters = READ_ONCE(slab->counters);
>>>> + VM_BUG_ON(!old.frozen);
>>>> +
>>>> + /* Determine target state of the slab */
>>>> + new.counters = old.counters;
>>>> + new.frozen = 0;
>>>> + if (freelist_tail) {
>>>> + new.inuse -= free_delta;
>>>> + set_freepointer(s, freelist_tail, old.freelist);
>>>> + new.freelist = freelist;
>>>> + } else {
>>>> + new.freelist = old.freelist;
>>>> + }
>>>> + } while (!slab_update_freelist(s, slab,
>>>> + old.freelist, old.counters,
>>>> + new.freelist, new.counters,
>>>> + "unfreezing slab"));
>>>>
>>>> + /*
>>>> + * Stage three: Manipulate the slab list based on the updated state.
>>>> + */
>>>
>>> deactivate_slab() might unconsciously put empty slabs into partial list, like:
>>>
>>> deactivate_slab() __slab_free()
>>> cmpxchg(), slab's not empty
>>> cmpxchg(), slab's empty
>>> and unfrozen
>>
>> Hi,
>>
>> Sorry, but I don't get it here how __slab_free() can see the slab empty,
>> since the slab is not empty from deactivate_slab() path, and it can't be
>> used by any CPU at that time?
>
> The scenario is CPU B previously allocated an object from slab X, but
> put it into node partial list and then CPU A have taken slab X into cpu slab.
>
> While slab X is CPU A's cpu slab, when CPU B frees an object from slab X,
> it puts the object into slab X's freelist using cmpxchg.
>
> Let's say in CPU A the deactivation path performs cmpxchg and X.inuse was 1,
> and then CPU B frees (__slab_free()) to slab X's freelist using cmpxchg,
> _before_ slab X's put into partial list by CPU A.
>
> Then CPU A thinks it's not empty so put it into partial list, but by CPU B
> the slab has become empty.
>
> Maybe I am confused, in that case please tell me I'm wrong :)
>
Ah, you're right! I misunderstood the slab "empty" with "full". :)
Yes, in this case the "empty" slab would be put into the node partial list,
and it should be fine in the real world as you noted earlier.
Thanks!
next prev parent reply other threads:[~2023-12-03 11:47 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-02 3:23 [PATCH v5 0/9] slub: Delay freezing of CPU partial slabs chengming.zhou
2023-11-02 3:23 ` [PATCH v5 1/9] slub: Reflow ___slab_alloc() chengming.zhou
2023-11-22 0:26 ` Hyeonggon Yoo
2023-11-02 3:23 ` [PATCH v5 2/9] slub: Change get_partial() interfaces to return slab chengming.zhou
2023-11-22 1:09 ` Hyeonggon Yoo
2023-11-02 3:23 ` [PATCH v5 3/9] slub: Keep track of whether slub is on the per-node partial list chengming.zhou
2023-11-22 1:21 ` Hyeonggon Yoo
2023-11-02 3:23 ` [PATCH v5 4/9] slub: Prepare __slab_free() for unfrozen partial slab out of node " chengming.zhou
2023-12-03 6:01 ` Hyeonggon Yoo
2023-11-02 3:23 ` [PATCH v5 5/9] slub: Introduce freeze_slab() chengming.zhou
2023-11-02 3:23 ` [PATCH v5 6/9] slub: Delay freezing of partial slabs chengming.zhou
2023-11-14 5:44 ` kernel test robot
2023-11-20 18:49 ` Mark Brown
2023-11-21 0:58 ` Chengming Zhou
2023-11-21 1:29 ` Mark Brown
2023-11-21 15:47 ` Chengming Zhou
2023-11-21 18:21 ` Mark Brown
2023-11-22 8:52 ` Vlastimil Babka
2023-11-22 9:37 ` Vlastimil Babka
2023-11-22 11:27 ` Mark Brown
2023-11-22 11:35 ` Chengming Zhou
2023-11-22 11:40 ` Vlastimil Babka
2023-11-22 11:54 ` Chengming Zhou
2023-11-22 13:19 ` Vlastimil Babka
2023-11-22 14:28 ` Chengming Zhou
2023-11-22 14:32 ` Vlastimil Babka
2023-12-03 6:53 ` Hyeonggon Yoo
2023-12-03 10:15 ` Chengming Zhou
2023-12-04 16:58 ` Vlastimil Babka
2023-11-02 3:23 ` [PATCH v5 7/9] slub: Optimize deactivate_slab() chengming.zhou
2023-12-03 9:23 ` Hyeonggon Yoo
2023-12-03 10:26 ` Chengming Zhou
2023-12-03 11:19 ` Hyeonggon Yoo
2023-12-03 11:47 ` Chengming Zhou [this message]
2023-12-04 17:55 ` Vlastimil Babka
2023-12-05 0:20 ` Hyeonggon Yoo
2023-11-02 3:23 ` [PATCH v5 8/9] slub: Rename all *unfreeze_partials* functions to *put_partials* chengming.zhou
2023-12-03 9:27 ` Hyeonggon Yoo
2023-11-02 3:23 ` [PATCH v5 9/9] slub: Update frozen slabs documentations in the source chengming.zhou
2023-12-03 9:47 ` Hyeonggon Yoo
2023-12-04 21:41 ` Christoph Lameter (Ampere)
2023-12-05 6:06 ` Chengming Zhou
2023-12-05 9:39 ` Vlastimil Babka
2023-11-13 8:36 ` [PATCH v5 0/9] slub: Delay freezing of CPU partial slabs Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=294cf54b-cdf5-4441-9924-67d934e54883@linux.dev \
--to=chengming.zhou@linux.dev \
--cc=42.hyeyoo@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=vbabka@suse.cz \
--cc=zhouchengming@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).