From: kernel test robot <oliver.sang@intel.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<io-uring@vger.kernel.org>, <oliver.sang@intel.com>
Subject: [axboe-block:io_uring-defer-tw.4] [io_uring] 61a5e20297: stress-ng.io-uring.ops_per_sec 41.9% regression
Date: Tue, 1 Jul 2025 12:47:55 +0800
Message-ID: <202507010550.2d6f83ea-lkp@intel.com>
Hello,
kernel test robot noticed a 41.9% regression of stress-ng.io-uring.ops_per_sec on:
commit: 61a5e202971d4a242fc761728e89922edde02d38 ("io_uring: switch defer task_work to using a ring")
https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git io_uring-defer-tw.4
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: io-uring
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202507010550.2d6f83ea-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250701/202507010550.2d6f83ea-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp2/io-uring/stress-ng/60s
commit:
8559f3b41f ("io_uring: make task_work pending check dependent on ring type")
61a5e20297 ("io_uring: switch defer task_work to using a ring")
8559f3b41fdcdd01 61a5e202971d4a242fc761728e8
---------------- ---------------------------
%stddev %change %stddev
\ | \
1022268 ± 2% -30.4% 711175 meminfo.Mapped
7.478e+09 +30.1% 9.727e+09 cpuidle..time
3.03e+08 -20.9% 2.398e+08 ± 3% cpuidle..usage
696425 ±171% +181.2% 1958387 ± 81% numa-meminfo.node0.Unevictable
940879 ± 10% -32.9% 631792 ± 14% numa-meminfo.node1.Mapped
43.50 ± 20% -73.9% 11.33 ± 53% perf-c2c.DRAM.local
32749 ± 10% -86.3% 4475 ± 29% perf-c2c.HITM.local
33251 ± 10% -85.0% 4989 ± 25% perf-c2c.HITM.total
14632245 ± 9% -38.5% 8999074 ± 7% numa-numastat.node0.local_node
14749610 ± 9% -38.6% 9056826 ± 6% numa-numastat.node0.numa_hit
21106190 ± 4% -37.5% 13198942 ± 4% numa-numastat.node1.local_node
21186924 ± 4% -37.0% 13339356 ± 4% numa-numastat.node1.numa_hit
43.02 ± 2% -12.5% 37.66 ± 2% vmstat.cpu.id
19.87 +121.8% 44.07 ± 2% vmstat.cpu.wa
73.14 +101.7% 147.54 ± 2% vmstat.procs.b
112.33 ± 2% -64.8% 39.60 ± 7% vmstat.procs.r
12695197 -38.2% 7849636 ± 4% vmstat.system.cs
5179340 ± 2% -24.4% 3915343 ± 4% vmstat.system.in
174059 ±171% +181.3% 489607 ± 81% numa-vmstat.node0.nr_unevictable
174060 ±171% +181.3% 489607 ± 81% numa-vmstat.node0.nr_zone_unevictable
14750003 ± 9% -38.6% 9057006 ± 6% numa-vmstat.node0.numa_hit
14632638 ± 9% -38.5% 8999253 ± 7% numa-vmstat.node0.numa_local
236391 ± 10% -33.3% 157713 ± 14% numa-vmstat.node1.nr_mapped
21186186 ± 4% -37.0% 13338387 ± 4% numa-vmstat.node1.numa_hit
21105453 ± 4% -37.5% 13197958 ± 4% numa-vmstat.node1.numa_local
41.57 -5.8 35.76 ± 2% mpstat.cpu.all.idle%
20.32 +25.1 45.43 ± 2% mpstat.cpu.all.iowait%
6.25 ± 4% -2.2 4.09 ± 6% mpstat.cpu.all.irq%
0.34 ± 4% -0.2 0.14 ± 6% mpstat.cpu.all.soft%
28.91 -15.5 13.40 ± 6% mpstat.cpu.all.sys%
2.62 -1.4 1.17 ± 6% mpstat.cpu.all.usr%
18.83 ± 5% -84.1% 3.00 mpstat.max_utilization.seconds
61.41 -30.1% 42.94 mpstat.max_utilization_pct
3.455e+08 -41.9% 2.006e+08 ± 4% stress-ng.io-uring.ops
5758736 -41.9% 3343243 ± 4% stress-ng.io-uring.ops_per_sec
63485668 -85.7% 9052788 ± 15% stress-ng.time.involuntary_context_switches
86971 -2.2% 85030 stress-ng.time.minor_page_faults
6021 -54.8% 2724 ± 6% stress-ng.time.percent_of_cpu_this_job_got
3383 -53.8% 1562 ± 6% stress-ng.time.system_time
248.17 -67.3% 81.18 ± 9% stress-ng.time.user_time
4.227e+08 -40.1% 2.531e+08 ± 4% stress-ng.time.voluntary_context_switches
2888857 ± 2% -8.1% 2654260 proc-vmstat.nr_active_anon
302955 -3.1% 293576 proc-vmstat.nr_anon_pages
3475920 ± 2% -6.5% 3250878 proc-vmstat.nr_file_pages
44207 -3.1% 42858 proc-vmstat.nr_kernel_stack
255933 ± 3% -30.6% 177546 proc-vmstat.nr_mapped
2586684 ± 3% -8.7% 2361525 proc-vmstat.nr_shmem
43152 -1.5% 42518 proc-vmstat.nr_slab_reclaimable
2888857 ± 2% -8.1% 2654260 proc-vmstat.nr_zone_active_anon
35939101 -37.7% 22399100 ± 3% proc-vmstat.numa_hit
35741003 -37.9% 22200912 ± 3% proc-vmstat.numa_local
585759 ± 5% -27.5% 424436 ± 8% proc-vmstat.numa_pte_updates
36196152 -37.5% 22624491 ± 3% proc-vmstat.pgalloc_normal
700860 ± 3% -7.0% 651538 ± 4% proc-vmstat.pgfault
32134448 -41.1% 18939637 ± 4% proc-vmstat.pgfree
16707904 -77.5% 3755057 ± 10% proc-vmstat.unevictable_pgs_culled
0.17 ± 4% +94.3% 0.32 ± 16% perf-stat.i.MPKI
2.698e+10 -40.1% 1.616e+10 ± 4% perf-stat.i.branch-instructions
0.92 -0.3 0.64 perf-stat.i.branch-miss-rate%
2.173e+08 -57.1% 93142321 ± 5% perf-stat.i.branch-misses
2.25 ± 4% +6.4 8.67 ± 17% perf-stat.i.cache-miss-rate%
1.262e+09 -68.8% 3.94e+08 ± 6% perf-stat.i.cache-references
13218006 -37.6% 8252620 ± 4% perf-stat.i.context-switches
3.40 -7.5% 3.15 ± 3% perf-stat.i.cpi
4.003e+11 -40.4% 2.384e+11 ± 5% perf-stat.i.cpu-cycles
5382764 -76.2% 1281759 ± 10% perf-stat.i.cpu-migrations
32980 ± 5% -25.9% 24437 ± 9% perf-stat.i.cycles-between-cache-misses
1.327e+11 -39.9% 7.973e+10 ± 4% perf-stat.i.instructions
0.33 +9.9% 0.36 ± 3% perf-stat.i.ipc
96.88 -48.8% 49.64 ± 4% perf-stat.i.metric.K/sec
8872 ± 4% -11.6% 7844 ± 4% perf-stat.i.minor-faults
8872 ± 4% -11.6% 7844 ± 4% perf-stat.i.page-faults
0.18 ± 3% +61.7% 0.29 ± 8% perf-stat.overall.MPKI
0.81 -0.2 0.58 perf-stat.overall.branch-miss-rate%
1.88 ± 3% +4.0 5.86 ± 9% perf-stat.overall.cache-miss-rate%
16903 ± 3% -38.3% 10426 ± 9% perf-stat.overall.cycles-between-cache-misses
2.655e+10 -40.1% 1.59e+10 ± 4% perf-stat.ps.branch-instructions
2.138e+08 -57.2% 91585587 ± 5% perf-stat.ps.branch-misses
1.241e+09 -68.8% 3.875e+08 ± 6% perf-stat.ps.cache-references
13003285 -37.6% 8120099 ± 4% perf-stat.ps.context-switches
3.938e+11 -40.5% 2.345e+11 ± 5% perf-stat.ps.cpu-cycles
5295095 -76.2% 1259803 ± 10% perf-stat.ps.cpu-migrations
1.306e+11 -39.9% 7.846e+10 ± 4% perf-stat.ps.instructions
8714 ± 4% -11.7% 7694 ± 4% perf-stat.ps.minor-faults
8714 ± 4% -11.7% 7694 ± 4% perf-stat.ps.page-faults
8.049e+12 -40.0% 4.829e+12 ± 4% perf-stat.total.instructions
879267 ± 3% -77.4% 198767 ± 46% sched_debug.cfs_rq:/.avg_vruntime.avg
2197261 ± 7% -80.3% 433455 ± 40% sched_debug.cfs_rq:/.avg_vruntime.max
702597 ± 3% -82.0% 126663 ± 48% sched_debug.cfs_rq:/.avg_vruntime.min
144651 ± 9% -75.7% 35081 ± 36% sched_debug.cfs_rq:/.avg_vruntime.stddev
0.38 ± 7% -79.6% 0.08 ± 20% sched_debug.cfs_rq:/.h_nr_queued.avg
2.92 ± 20% -65.7% 1.00 sched_debug.cfs_rq:/.h_nr_queued.max
0.61 ± 4% -57.2% 0.26 ± 9% sched_debug.cfs_rq:/.h_nr_queued.stddev
0.34 ± 6% -77.5% 0.08 ± 19% sched_debug.cfs_rq:/.h_nr_runnable.avg
2.92 ± 20% -65.7% 1.00 sched_debug.cfs_rq:/.h_nr_runnable.max
0.56 ± 5% -53.3% 0.26 ± 9% sched_debug.cfs_rq:/.h_nr_runnable.stddev
115895 ± 14% -93.3% 7740 ± 69% sched_debug.cfs_rq:/.left_deadline.avg
1148129 ± 31% -77.7% 255916 ± 52% sched_debug.cfs_rq:/.left_deadline.max
300169 ± 8% -87.0% 39025 ± 54% sched_debug.cfs_rq:/.left_deadline.stddev
115876 ± 14% -93.3% 7740 ± 69% sched_debug.cfs_rq:/.left_vruntime.avg
1147975 ± 31% -77.7% 255883 ± 52% sched_debug.cfs_rq:/.left_vruntime.max
300120 ± 8% -87.0% 39021 ± 54% sched_debug.cfs_rq:/.left_vruntime.stddev
2.08 ± 16% -100.0% 0.00 sched_debug.cfs_rq:/.load_avg.min
879267 ± 3% -77.4% 198767 ± 46% sched_debug.cfs_rq:/.min_vruntime.avg
2197261 ± 7% -80.3% 433455 ± 40% sched_debug.cfs_rq:/.min_vruntime.max
702597 ± 3% -82.0% 126663 ± 48% sched_debug.cfs_rq:/.min_vruntime.min
144651 ± 9% -75.7% 35081 ± 36% sched_debug.cfs_rq:/.min_vruntime.stddev
0.24 ± 5% -67.9% 0.08 ± 19% sched_debug.cfs_rq:/.nr_queued.avg
0.36 -27.4% 0.26 ± 9% sched_debug.cfs_rq:/.nr_queued.stddev
115876 ± 14% -93.3% 7740 ± 69% sched_debug.cfs_rq:/.right_vruntime.avg
1147975 ± 31% -77.7% 255883 ± 52% sched_debug.cfs_rq:/.right_vruntime.max
300120 ± 8% -87.0% 39021 ± 54% sched_debug.cfs_rq:/.right_vruntime.stddev
293.31 ± 2% -61.0% 114.35 ± 10% sched_debug.cfs_rq:/.runnable_avg.avg
114.75 ± 6% -100.0% 0.00 sched_debug.cfs_rq:/.runnable_avg.min
161.40 ± 3% +16.8% 188.44 ± 6% sched_debug.cfs_rq:/.runnable_avg.stddev
243.06 ± 2% -53.0% 114.20 ± 10% sched_debug.cfs_rq:/.util_avg.avg
111.42 ± 5% -100.0% 0.00 sched_debug.cfs_rq:/.util_avg.min
143.53 ± 4% +31.2% 188.36 ± 6% sched_debug.cfs_rq:/.util_avg.stddev
45.14 ± 5% -53.8% 20.87 ± 29% sched_debug.cfs_rq:/.util_est.avg
117.16 ± 9% -23.3% 89.81 ± 15% sched_debug.cfs_rq:/.util_est.stddev
460889 +78.9% 824600 ± 4% sched_debug.cpu.avg_idle.avg
545161 ± 4% +83.4% 1000000 sched_debug.cpu.avg_idle.max
7815 ± 7% -47.7% 4084 ± 14% sched_debug.cpu.avg_idle.min
96234 ± 8% +192.4% 281404 ± 13% sched_debug.cpu.avg_idle.stddev
754.64 ± 5% -19.2% 609.61 ± 9% sched_debug.cpu.clock_task.stddev
1016 ± 7% -74.3% 261.72 ± 25% sched_debug.cpu.curr->pid.avg
1648 -37.6% 1027 ± 14% sched_debug.cpu.curr->pid.stddev
0.00 ± 24% -27.7% 0.00 ± 10% sched_debug.cpu.next_balance.stddev
0.35 ± 10% -82.5% 0.06 ± 20% sched_debug.cpu.nr_running.avg
2.92 ± 20% -65.7% 1.00 sched_debug.cpu.nr_running.max
0.60 ± 6% -61.0% 0.23 ± 8% sched_debug.cpu.nr_running.stddev
2060126 -47.9% 1073009 ± 44% sched_debug.cpu.nr_switches.avg
2688437 -31.6% 1839609 ± 44% sched_debug.cpu.nr_switches.max
650892 ± 9% -43.0% 370926 ± 54% sched_debug.cpu.nr_switches.min
522908 ± 2% -49.9% 261974 ± 45% sched_debug.cpu.nr_switches.stddev
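For readers checking the comparison above by hand: the middle column appears to be a relative change in percent for most rows, and an absolute difference in percentage points for metrics whose name ends in "%" (the mpstat rows). That reading is inferred from the numbers in this report, not from lkp-tests documentation. A minimal Python sketch, with hypothetical helper names, recomputes a few rows from the rounded values shown; the robot presumably works from unrounded per-run data, so the last digit may differ on other rows.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change in percent (assumed meaning of the %change column)."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, assumed for metrics already expressed in percent."""
    return new - base


# Values copied from the table: parent commit 8559f3b41f vs. 61a5e20297.
rows_relative = {
    "stress-ng.io-uring.ops_per_sec": (5758736, 3343243),
    "vmstat.cpu.wa": (19.87, 44.07),
}
rows_points = {
    "mpstat.cpu.all.iowait%": (20.32, 45.43),
}

for name, (base, new) in rows_relative.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")

for name, (base, new) in rows_points.items():
    print(f"{name}: {point_change(base, new):+.1f} points")
```

The recomputed ops_per_sec change matches the 41.9% regression in the subject line, and the iowait row matches the +25.1 shown for mpstat.cpu.all.iowait%.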
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki