public inbox for linux-mm@kvack.org
From: Ming Lei <ming.lei@redhat.com>
To: Hao Li <hao.li@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Harry Yoo <harry.yoo@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org
Subject: Re: [Regression] mm:slab/sheaves: severe performance regression in cross-CPU slab allocation
Date: Thu, 12 Mar 2026 22:50:32 +0800	[thread overview]
Message-ID: <abLSuHU3eJrel6KI@fedora> (raw)
In-Reply-To: <iypdqo2s5oobenjrmoqqplgshsz65bwegih7kxhgd547fcofm7@yb6xqors6snx>

On Thu, Mar 12, 2026 at 08:13:18PM +0800, Hao Li wrote:
> On Thu, Mar 12, 2026 at 07:56:31PM +0800, Ming Lei wrote:
> > On Thu, Mar 12, 2026 at 07:26:28PM +0800, Hao Li wrote:
> > > On Tue, Feb 24, 2026 at 10:52:28AM +0800, Ming Lei wrote:
> > > > Hello Vlastimil and MM guys,
> > > > 
> > > > The SLUB "sheaves" series merged via 815c8e35511d ("Merge branch
> > > > 'slab/for-7.0/sheaves' into slab/for-next") introduces a severe
> > > > performance regression for workloads with persistent cross-CPU
> > > > alloc/free patterns. ublk null target benchmark IOPS drops
> > > > significantly compared to v6.19: from ~36M IOPS to ~13M IOPS (~64%
> > > > drop).
> > > > 
> > > > Bisecting within the sheaves series is blocked by a kernel panic at
> > > > 17c38c88294d ("slab: remove cpu (partial) slabs usage from allocation
> > > > paths"), so the exact first bad commit could not be identified.
> > > > 
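
[As a quick sanity check on the numbers above -- illustrative arithmetic only, with the IOPS figures taken from the report:]

```python
# Regression magnitude implied by the reported figures
# (36M IOPS on v6.19 vs ~13M IOPS with the sheaves merge).
good, bad = 36e6, 13e6
drop = (good - bad) / good
print(f"{drop:.0%} drop")   # -> 64% drop
```
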
> > > > Reproducer
> > > > ==========
> > > > 
> > > > Hardware: NUMA machine with >= 32 CPUs
> > > > Kernel:   v7.0-rc (with slab/for-7.0/sheaves merged)
> > > > 
> > > >     # build kublk selftest
> > > >     make -C tools/testing/selftests/ublk/
> > > > 
> > > >     # create ublk null target device with 16 queues
> > > >     tools/testing/selftests/ublk/kublk add -t null -q 16
> > > > 
> > > >     # run fio/t/io_uring benchmark: 16 jobs, 20 seconds, non-polled
> > > >     taskset -c 0-31 fio/t/io_uring -p0 -n 16 -r 20 /dev/ublkb0
> > > > 
> > > >     # cleanup
> > > >     tools/testing/selftests/ublk/kublk del -n 0
> > > > 
> > > > Good: v6.19 (and 41f1a08645ab, the mainline parent of the slab merge)
> > > > Bad:  815c8e35511d (Merge branch 'slab/for-7.0/sheaves' into slab/for-next)
> > > > 
> > > 
> > > Hi Ming,
> > > 
> > > I also have a similar machine, but my test results show IOPS below 1M, only
> > > around 900K. That seems quite strange to me.
> > > 
> > > My test commands are:
> > > 
> > > ```bash
> > > tools/testing/selftests/ublk/kublk add -t null -q 16
> > > taskset -c 24-47 /home/haolee/fio/t/io_uring -p0 -n 16 -r 20 /dev/ublkb0
> > > ```
> > 
> > The command line looks similar to mine; in my tests:
> > 
> > taskset -c 0-31 fio/t/io_uring -p0 -n 16 -r 20 /dev/ublkb0
> > 
> > so the test runs on CPUs 0-31, which covers all 8 NUMA nodes.
> 
> Oh, yes, this is a difference.
> 
> > 
> > Also what is the single job perf result on your setting?
> > 
> > /home/haolee/fio/t/io_uring -p0 -n 1 -r 20 /dev/ublkb0
> 
> If I use this command without taskset, the IOPS is still 900K...

So a single job (-n 1) can reach 900K, which is not bad.

But if 16 jobs still only reach ~1M in total, that does not look good.

On my machine, a single job can reach 2.7M, and 16 jobs (taskset -c 0-31)
can get 13M on v7.0-rc3.
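
[The scaling implied by these figures can be sketched as follows -- IOPS
values copied from this thread, purely illustrative:]

```python
# Per-job throughput at 16 jobs vs a single job (Ming's machine).
single_job = 2.7e6                 # -n 1
jobs16 = 13e6                      # -n 16, taskset -c 0-31, v7.0-rc3
print(f"per-job at 16 jobs: {jobs16 / 16 / 1e6:.2f}M")    # -> 0.81M
print(f"aggregate scaling:  {jobs16 / single_job:.1f}x")  # -> 4.8x

# By contrast, 16 jobs reaching only ~1M in total is barely above the
# ~900K a single job achieves alone -- the anomaly discussed here.
```
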


> 
> > 
> > > 
> > > Below is my machine's NUMA info. Could there be something configured incorrectly
> > > on my side?
> > > 
> > > available: 8 nodes (0-7)
> > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
> > > node 0 size: 193175 MB
> > > node 0 free: 164227 MB
> > > node 1 cpus: 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
> > > node 1 size: 0 MB
> > > node 1 free: 0 MB
> > > node 2 cpus: 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
> > > node 2 size: 0 MB
> > > node 2 free: 0 MB
> > > node 3 cpus: 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
> > > node 3 size: 0 MB
> > > node 3 free: 0 MB
> > > node 4 cpus: 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
> > > node 4 size: 193434 MB
> > > node 4 free: 189559 MB
> > > node 5 cpus: 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
> > > node 5 size: 0 MB
> > > node 5 free: 0 MB
> > > node 6 cpus: 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167
> > > node 6 size: 0 MB
> > > node 6 free: 0 MB
> > > node 7 cpus: 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191
> > > node 7 size: 0 MB
> > > node 7 free: 0 MB
> > > node distances:
> > > node   0   1   2   3   4   5   6   7
> > >   0:  10  12  12  12  32  32  32  32
> > >   1:  12  10  12  12  32  32  32  32
> > >   2:  12  12  10  12  32  32  32  32
> > >   3:  12  12  12  10  32  32  32  32
> > >   4:  32  32  32  32  10  12  12  12
> > >   5:  32  32  32  32  12  10  12  12
> > >   6:  32  32  32  32  12  12  10  12
> > >   7:  32  32  32  32  12  12  12  10
> > 
> > The NUMA topology is different from mine, please see:
> > 
> > https://lore.kernel.org/all/aZ7p9uF8H8u6RxrK@fedora/
> 
> Yes, our NUMA topology does have some differences, but I feel there may be some
> other factors affecting my test results as well.
> 
> Even when I run with "-p0 -n 16 -r 20 /dev/ublkb0" without using taskset to pin
> the CPU affinity, the best performance I can get is only around 10M.
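
[The node-distance table quoted above already encodes the split on that
box; a small sketch, with the matrix values copied verbatim from the
numactl output in this thread:]

```python
# Group NUMA nodes by locality: nodes at distance < 32 count as "near".
dist = [
    [10, 12, 12, 12, 32, 32, 32, 32],
    [12, 10, 12, 12, 32, 32, 32, 32],
    [12, 12, 10, 12, 32, 32, 32, 32],
    [12, 12, 12, 10, 32, 32, 32, 32],
    [32, 32, 32, 32, 10, 12, 12, 12],
    [32, 32, 32, 32, 12, 10, 12, 12],
    [32, 32, 32, 32, 12, 12, 10, 12],
    [32, 32, 32, 32, 12, 12, 12, 10],
]
near = [[j for j, d in enumerate(row) if d < 32] for row in dist]
groups = sorted({tuple(g) for g in near})
print(groups)   # -> [(0, 1, 2, 3), (4, 5, 6, 7)]
```

[So on that box, nodes 0-3 and 4-7 form two distant groups (distance 32),
which differs from the topology in the link above.]
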

What is the data when you run the same test on v6.19?
 
Thanks,
Ming



Thread overview: 40+ messages
2026-02-24  2:52 [Regression] mm:slab/sheaves: severe performance regression in cross-CPU slab allocation Ming Lei
2026-02-24  5:00 ` Harry Yoo
2026-02-24  9:07   ` Ming Lei
2026-02-25  5:32     ` Hao Li
2026-02-25  6:54       ` Harry Yoo
2026-02-25  7:06         ` Hao Li
2026-02-25  7:19           ` Harry Yoo
2026-02-25  8:19             ` Hao Li
2026-02-25  8:41               ` Harry Yoo
2026-02-25  8:54                 ` Hao Li
2026-02-25  8:21             ` Harry Yoo
2026-02-24  6:51 ` Hao Li
2026-02-24  7:10   ` Harry Yoo
2026-02-24  7:41     ` Hao Li
2026-02-24 20:27 ` Vlastimil Babka
2026-02-25  5:24   ` Harry Yoo
2026-02-25  8:45   ` Vlastimil Babka (SUSE)
2026-02-25  9:31     ` Ming Lei
2026-02-25 11:29       ` Vlastimil Babka (SUSE)
2026-02-25 12:24         ` Ming Lei
2026-02-25 13:22           ` Vlastimil Babka (SUSE)
2026-02-26 18:02       ` Vlastimil Babka (SUSE)
2026-02-27  9:23         ` Ming Lei
2026-03-05 13:05           ` Vlastimil Babka (SUSE)
2026-03-05 15:48             ` Ming Lei
2026-03-06  1:01               ` Ming Lei
2026-03-06  4:17               ` Hao Li
2026-03-06  4:55         ` Harry Yoo
2026-03-06  8:32           ` Hao Li
2026-03-06  8:47           ` Vlastimil Babka (SUSE)
2026-03-06 10:22             ` Ming Lei
2026-03-11  1:10               ` Harry Yoo
2026-03-11 10:15                 ` Ming Lei
2026-03-11 10:43                   ` Ming Lei
2026-03-12  4:11                   ` Harry Yoo
2026-03-12 11:26 ` Hao Li
2026-03-12 11:56   ` Ming Lei
2026-03-12 12:13     ` Hao Li
2026-03-12 14:50       ` Ming Lei [this message]
2026-03-13  3:26         ` Hao Li
