From: Ming Lei <ming.lei@redhat.com>
To: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Cc: Harry Yoo <harry.yoo@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, Hao Li <hao.li@linux.dev>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [Regression] mm:slab/sheaves: severe performance regression in cross-CPU slab allocation
Date: Fri, 6 Mar 2026 18:22:37 +0800 [thread overview]
Message-ID: <aaqq7YmUcOht3GWH@fedora> (raw)
In-Reply-To: <d5cd4eb4-1703-41af-9c71-842ae33edbdf@kernel.org>
On Fri, Mar 06, 2026 at 09:47:27AM +0100, Vlastimil Babka (SUSE) wrote:
> On 3/6/26 05:55, Harry Yoo wrote:
> > On Thu, Feb 26, 2026 at 07:02:11PM +0100, Vlastimil Babka (SUSE) wrote:
> >> On 2/25/26 10:31, Ming Lei wrote:
> >> > Hi Vlastimil,
> >> >
> >> > On Wed, Feb 25, 2026 at 09:45:03AM +0100, Vlastimil Babka (SUSE) wrote:
> >> >> On 2/24/26 21:27, Vlastimil Babka wrote:
> >> >> >
> >> >> > It made sense to me not to refill sheaves when we can't reclaim, but I
> >> >> > didn't anticipate this interaction with mempools. We could change them
> >> >> > but there might be others using a similar pattern. Maybe it would be for
> >> >> > the best to just drop that heuristic from __pcs_replace_empty_main()
> >> >> > (but carefully as some deadlock avoidance depends on it, we might need
> >> >> > to e.g. replace it with gfpflags_allow_spinning()). I'll send a patch
> >> >> > tomorrow to test this theory, unless someone beats me to it (feel free to).
> >> >> Could you try this then, please? Thanks!
> >> >
> >> > Thanks for working on this issue!
> >> >
> >> > Unfortunately the patch doesn't make a difference in IOPS in the perf test;
> >> > the collected perf profile on the linus tree (basically 7.0-rc1 with your patch) follows:
> >>
> >> what about this patch in addition to the previous one? Thanks.
> >>
> >> ----8<----
> >> From d3e8118c078996d1372a9f89285179d93971fdb2 Mon Sep 17 00:00:00 2001
> >> From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
> >> Date: Thu, 26 Feb 2026 18:59:56 +0100
> >> Subject: [PATCH] mm/slab: put barn on every online node
> >>
> >> Including memoryless nodes.
> >>
> >> Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
> >> ---
> >
> > Just taking a quick look...
> >
> >> @@ -6121,7 +6122,8 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
> >> if (unlikely(!slab_free_hook(s, object, slab_want_init_on_free(s), false)))
> >> return;
> >>
> >> - if (likely(!IS_ENABLED(CONFIG_NUMA) || slab_nid(slab) == numa_mem_id())
> >> + if (likely(!IS_ENABLED(CONFIG_NUMA) || (slab_nid(slab) == numa_mem_id())
> >> + || !node_isset(slab_nid(slab), slab_nodes))
> >
> > I think you intended !node_isset(numa_mem_id(), slab_nodes)?
> >
> > "Skip freeing to pcs if it's a remote free, but memoryless nodes are
> > an exception".
>
> Indeed, thanks! Ming, could you retry with that fixed up please?
After applying the following change, IOPS is ~25M:
- delta change on top of the two patches:
diff --git a/mm/slub.c b/mm/slub.c
index 085fe49eec68..56fe8bd956c0 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -6142,7 +6142,7 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
return;
if (likely(!IS_ENABLED(CONFIG_NUMA) || (slab_nid(slab) == numa_mem_id())
- || !node_isset(slab_nid(slab), slab_nodes))
+ || !node_isset(numa_mem_id(), slab_nodes))
&& likely(!slab_test_pfmemalloc(slab))) {
if (likely(free_to_pcs(s, object, true)))
return;
- slab stats on patched `815c8e35511d Merge branch 'slab/for-7.0/sheaves' into slab/for-next`
# (cd /sys/kernel/slab/bio-256/ && find . -type f -exec grep -aH . {} \;)
./remote_node_defrag_ratio:100
./total_objects:7395 N1=3876 N5=3519
./alloc_fastpath:507619662 C0=70 C1=27608632 C3=28990301 C5=35098386 C6=9 C7=35782152 C8=115 C9=31757274 C10=32 C11=30087065 C12=34 C13=31615065 C14=7 C15=31798233 C17=30695955 C18=128 C19=32204853 C20=64 C21=36842392 C23=36212376 C25=30013640 C27=29055001 C29=29990232 C30=48 C31=29867595 C36=2 C50=1
./cpu_slabs:0
./objects:7232 N1=3816 N5=3416
./sheaf_return_slow:0
./objects_partial:500 N1=195 N5=305
./sheaf_return_fast:0
./cpu_partial:0
./free_slowpath:20 C4=20
./barn_get_fail:260 C1=6 C3=26 C5=26 C7=7 C9=5 C10=2 C11=26 C12=2 C13=10 C14=1 C15=19 C17=8 C18=5 C19=19 C20=1 C21=9 C23=22 C25=11 C27=21 C29=26 C31=6 C36=1 C50=1
./sheaf_prefill_oversize:0
./skip_kfence:0
./min_partial:5
./order_fallback:0
./sheaf_capacity:28
./sheaf_flush:28 C24=28
./free_rcu_sheaf:0
./sheaf_alloc:178 C0=4 C2=9 C3=1 C4=9 C5=65 C6=4 C8=5 C10=8 C11=1 C12=4 C13=1 C14=8 C15=1 C16=5 C18=8 C19=1 C20=3 C22=10 C23=1 C24=5 C25=1 C26=7 C27=1 C28=10 C29=1 C30=2 C31=1 C36=1 C50=1
./sheaf_free:0
./sheaf_prefill_slow:0
./sheaf_prefill_fast:0
./poison:0
./red_zone:0
./free_slab:0
./slabs:145 N1=76 N5=69
./barn_get:18129029 C0=3 C1=986017 C3=1035342 C5=1253488 C6=1 C7=1277927 C8=5 C9=1134184 C11=1074513 C13=1129100 C15=1135633 C17=1096277 C19=1150155 C20=2 C21=1315791 C23=1293278 C25=1071905 C27=1037658 C29=1071054 C30=2 C31=1066694
./alloc_slowpath:0
./destroy_by_rcu:1
./free_rcu_sheaf_fail:0
./barn_put:18129105 C0=986015 C2=1035357 C4=1253502 C6=1277924 C8=1134182 C10=1074529 C12=1129101 C14=1135641 C16=1096273 C18=1150168 C20=1315792 C22=1293288 C24=1071905 C26=1037668 C28=1071069 C30=1066691
./usersize:0
./sanity_checks:0
./barn_put_fail:1 C24=1
./align:64
./alloc_node_mismatch:0
./alloc_slab:145 C1=3 C3=19 C5=6 C7=3 C9=3 C10=2 C11=18 C12=2 C13=6 C14=1 C15=12 C17=8 C18=3 C19=12 C21=2 C23=5 C25=7 C27=12 C29=15 C31=4 C36=1 C50=1
./free_remove_partial:0
./aliases:0
./store_user:0
./trace:0
./reclaim_account:0
./order:2
./sheaf_refill:7280 C1=168 C3=728 C5=728 C7=196 C9=140 C10=56 C11=728 C12=56 C13=280 C14=28 C15=532 C17=224 C18=140 C19=532 C20=28 C21=252 C23=616 C25=308 C27=588 C29=728 C31=168 C36=28 C50=28
./object_size:256
./free_fastpath:507615526 C0=27608438 C2=28990052 C4=35098103 C6=35781903 C8=31757101 C10=30086841 C12=31614841 C14=31797983 C16=30695700 C18=32204722 C19=1 C20=36842201 C22=36212117 C24=30013416 C26=29054742 C28=29989974 C30=29867383 C31=4 C39=2 C47=2
./hwcache_align:1
./cmpxchg_double_fail:0
./objs_per_slab:51
./partial:13 N1=5 N5=8
./slabs_cpu_partial:0(0)
./free_add_partial:117 C1=3 C3=7 C5=19 C7=4 C9=2 C11=8 C13=4 C15=7 C18=2 C19=7 C20=1 C21=7 C23=17 C24=3 C25=4 C27=9 C29=11 C31=2
./slab_size:320
./cache_dma:0
Thanks,
Ming