From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: Hao Li <hao.li@linux.dev>, Marcelo Tosatti <mtosatti@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] slab: distinguish lock and trylock for sheaf_flush_main()
Date: Thu, 26 Feb 2026 15:50:16 +0100
Message-ID: <c6e94e2a-34ac-4515-b4df-222f2c08c992@kernel.org>
In-Reply-To: <20260211-b4-sheaf-flush-v1-1-4e7f492f0055@suse.cz>
On 2/11/26 10:42, Vlastimil Babka wrote:
> sheaf_flush_main() can be called from __pcs_replace_full_main() where
> the trylock can in theory fail, and pcs_flush_all() where it's not
> expected to and it would be actually a problem if it failed and left the
> main sheaf not flushed.
Thinking about this more, I now think it's not just a theoretical issue.
On PREEMPT_RT, I think pcs_flush_all() can preempt a task holding the lock
(on PREEMPT_RT the preempting context doesn't have to be an irq handler),
and then silently fail to flush the main sheaf.

The impact is probably limited though. If this failure to flush happens in
__kmem_cache_shutdown(), it means someone was destroying a cache while using
it, so that was already buggy. slab_mem_going_offline_callback() could be
where this matters, although it's unlikely someone would do memory hotplug
together with PREEMPT_RT.

But it's maybe still worth tagging this as Fixes: 2d517aa09bbc ("slab: add
opt-in caching layer of percpu sheaves"), adding Cc: stable, and sending it
as a hotfix.
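
To illustrate the failure mode described above, here is a minimal userspace
sketch, not kernel code. It assumes a C11 atomic_flag as a stand-in for
s->cpu_sheaves->lock, and every demo_* name is hypothetical. It only shows
that a trylock-based flush leaves the objects in place when the lock is
already held, and that a caller ignoring the return value never notices.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a per-cpu main sheaf; not kernel code. */
struct demo_sheaf {
	atomic_flag lock;	/* stands in for s->cpu_sheaves->lock */
	unsigned int size;	/* objects currently cached in the sheaf */
};

/* Models local_trylock(): returns true if the lock was acquired. */
static bool demo_trylock(struct demo_sheaf *sh)
{
	return !atomic_flag_test_and_set(&sh->lock);
}

static void demo_unlock(struct demo_sheaf *sh)
{
	atomic_flag_clear(&sh->lock);
}

/* Models a trylock-only flush: gives up quietly if the lock is taken. */
bool demo_flush_trylock(struct demo_sheaf *sh)
{
	if (!demo_trylock(sh))
		return false;

	sh->size = 0;		/* "flush" every cached object */
	demo_unlock(sh);
	return true;
}

/*
 * Simulates the flusher running while another context holds the lock.
 * Returns the number of objects still left in the sheaf afterwards.
 */
unsigned int demo_flush_while_held(unsigned int cached)
{
	struct demo_sheaf sh = { .lock = ATOMIC_FLAG_INIT, .size = cached };

	demo_trylock(&sh);		/* the preempted holder's lock */
	(void)demo_flush_trylock(&sh);	/* caller ignores the result */
	demo_unlock(&sh);

	return sh.size;
}
```

In this model, demo_flush_while_held() returns its argument unchanged,
because the flush backed out without touching the sheaf.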
> To make this explicit, split the function into sheaf_flush_main() (using
> local_lock()) and sheaf_try_flush_main() (using local_trylock()) where
> both call __sheaf_flush_main_batch() to flush a single batch of objects.
> This will allow lockdep to verify our assumptions.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/slub.c | 47 +++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 37 insertions(+), 10 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 18c30872d196..12912b29f5bb 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2844,19 +2844,19 @@ static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p);
>   * object pointers are moved to a on-stack array under the lock. To bound the
>   * stack usage, limit each batch to PCS_BATCH_MAX.
>   *
> - * returns true if at least partially flushed
> + * Must be called with s->cpu_sheaves->lock locked, returns with the lock
> + * unlocked.
> + *
> + * Returns how many objects are remaining to be flushed
>   */
> -static bool sheaf_flush_main(struct kmem_cache *s)
> +static unsigned int __sheaf_flush_main_batch(struct kmem_cache *s)
>  {
>  	struct slub_percpu_sheaves *pcs;
>  	unsigned int batch, remaining;
>  	void *objects[PCS_BATCH_MAX];
>  	struct slab_sheaf *sheaf;
> -	bool ret = false;
>
> -next_batch:
> -	if (!local_trylock(&s->cpu_sheaves->lock))
> -		return ret;
> +	lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
>
>  	pcs = this_cpu_ptr(s->cpu_sheaves);
>  	sheaf = pcs->main;
> @@ -2874,10 +2874,37 @@ static bool sheaf_flush_main(struct kmem_cache *s)
>
>  	stat_add(s, SHEAF_FLUSH, batch);
>
> -	ret = true;
> +	return remaining;
> +}
>
> -	if (remaining)
> -		goto next_batch;
> +static void sheaf_flush_main(struct kmem_cache *s)
> +{
> +	unsigned int remaining;
> +
> +	do {
> +		local_lock(&s->cpu_sheaves->lock);
> +
> +		remaining = __sheaf_flush_main_batch(s);
> +
> +	} while (remaining);
> +}
> +
> +/*
> + * Returns true if the main sheaf was at least partially flushed.
> + */
> +static bool sheaf_try_flush_main(struct kmem_cache *s)
> +{
> +	unsigned int remaining;
> +	bool ret = false;
> +
> +	do {
> +		if (!local_trylock(&s->cpu_sheaves->lock))
> +			return ret;
> +
> +		ret = true;
> +		remaining = __sheaf_flush_main_batch(s);
> +
> +	} while (remaining);
>
>  	return ret;
>  }
> @@ -5685,7 +5712,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>  	if (put_fail)
>  		stat(s, BARN_PUT_FAIL);
>
> -	if (!sheaf_flush_main(s))
> +	if (!sheaf_try_flush_main(s))
>  		return NULL;
>
>  	if (!local_trylock(&s->cpu_sheaves->lock))
>
> ---
> base-commit: 27125df9a5d3b4cfd03bce3a8ec405a368cc9aae
> change-id: 20260211-b4-sheaf-flush-2eb99a9c8bfb
>
> Best regards,
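
The split described in the quoted patch can be sketched in userspace as
follows. This is only a model under stated assumptions: a C11 atomic_flag
spinlock stands in for the local lock, MODEL_BATCH_MAX stands in for
PCS_BATCH_MAX, and all model_* names are hypothetical. It mirrors the shape
of the patch: one batch helper that is entered with the lock held and
returns with it released, plus a blocking caller and a trylock caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_BATCH_MAX 4	/* hypothetical stand-in for PCS_BATCH_MAX */

/* Hypothetical userspace model of the per-cpu sheaves; not kernel code. */
struct model_pcs {
	atomic_flag lock;	/* stands in for s->cpu_sheaves->lock */
	unsigned int size;	/* objects still in the main sheaf */
	unsigned int flushed;	/* objects released so far */
};

static bool model_trylock(struct model_pcs *pcs)
{
	return !atomic_flag_test_and_set(&pcs->lock);
}

/* Blocking acquire: spins until the lock becomes free. */
static void model_lock(struct model_pcs *pcs)
{
	while (!model_trylock(pcs))
		;
}

static void model_unlock(struct model_pcs *pcs)
{
	atomic_flag_clear(&pcs->lock);
}

/*
 * Must be called with the lock held; returns with the lock released.
 * Returns how many objects remain to be flushed.
 */
static unsigned int model_flush_batch(struct model_pcs *pcs)
{
	unsigned int batch = pcs->size < MODEL_BATCH_MAX ?
			     pcs->size : MODEL_BATCH_MAX;
	unsigned int remaining;

	pcs->size -= batch;
	remaining = pcs->size;
	model_unlock(pcs);

	pcs->flushed += batch;	/* the "free" step runs after unlocking */
	return remaining;
}

/* Blocking variant: always flushes the whole sheaf. */
void model_flush_main(struct model_pcs *pcs)
{
	unsigned int remaining;

	do {
		model_lock(pcs);
		remaining = model_flush_batch(pcs);
	} while (remaining);
}

/* Trylock variant: returns true if at least one batch was flushed. */
bool model_try_flush_main(struct model_pcs *pcs)
{
	unsigned int remaining;
	bool ret = false;

	do {
		if (!model_trylock(pcs))
			return ret;

		ret = true;
		remaining = model_flush_batch(pcs);
	} while (remaining);

	return ret;
}
```

The blocking variant has no failure path to report, which is why it can
return void, while the trylock variant keeps a bool so that its caller can
back out when the lock is contended.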
Thread overview: 7+ messages
2026-02-11 9:42 [PATCH] slab: distinguish lock and trylock for sheaf_flush_main() Vlastimil Babka
2026-02-12 3:11 ` Harry Yoo
2026-02-12 6:48 ` Hao Li
2026-02-26 14:50 ` Vlastimil Babka (SUSE) [this message]
2026-03-02 9:56 ` Vlastimil Babka (SUSE)
2026-03-04 1:05 ` Harry Yoo
2026-03-04 3:01 ` Hao Li