[linus:master] [ucount] b4dc0bee2a: stress-ng.set.ops_per_sec 7.5% improvement
From: kernel test robot @ 2025-04-22 8:42 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Paul E. McKenney,
Thomas Gleixner, Boqun Feng, Joel Fernandes, Josh Triplett,
Lai jiangshan, Mathieu Desnoyers, Mengen Sun, Steven Rostedt,
Uladzislau Rezki (Sony), YueHong Wu, Zqiang, oliver.sang
Hello,
kernel test robot noticed a 7.5% improvement of stress-ng.set.ops_per_sec on:
commit: b4dc0bee2a749083028afba346910e198653f42a ("ucount: use rcuref_t for reference counting")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:
	nr_threads: 100%
	testtime: 60s
	test: set
	cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250422/202504221604.38512645-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/set/stress-ng/60s
commit:
  5f01a22c5b ("ucount: use RCU for ucounts lookups")
  b4dc0bee2a ("ucount: use rcuref_t for reference counting")
5f01a22c5b231dd5 b4dc0bee2a749083028afba3469
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
10.78 -1.6 9.22 mpstat.cpu.all.soft%
150.17 ± 5% -54.4% 68.50 ± 11% perf-c2c.DRAM.local
14.70 ± 13% -18.0% 12.05 ± 8% vmstat.procs.r
235759 +7.0% 252328 vmstat.system.cs
229993 -1.6% 226301 vmstat.system.in
1456 ± 3% -40.2% 870.70 ± 59% sched_debug.cfs_rq:/.avg_vruntime.min
56228 ±133% -81.2% 10577 ± 72% sched_debug.cfs_rq:/.load.avg
605295 ±186% -89.2% 65458 ± 51% sched_debug.cfs_rq:/.load.stddev
1456 ± 3% -40.2% 870.70 ± 59% sched_debug.cfs_rq:/.min_vruntime.min
259.95 ± 14% -33.5% 172.79 ± 21% sched_debug.cpu.curr->pid.avg
1122 ± 7% -22.0% 874.99 ± 15% sched_debug.cpu.curr->pid.stddev
7692146 +7.5% 8265701 stress-ng.set.ops
128199 +7.5% 137758 stress-ng.set.ops_per_sec
28263 ± 2% -34.1% 18622 ± 2% stress-ng.time.involuntary_context_switches
77524 -1.7% 76216 stress-ng.time.minor_page_faults
750.50 -3.0% 728.33 stress-ng.time.percent_of_cpu_this_job_got
416.18 -3.8% 400.28 stress-ng.time.system_time
7083512 +8.2% 7667679 stress-ng.time.voluntary_context_switches
141813 +4.2% 147721 proc-vmstat.nr_shmem
1695593 +2.9% 1745184 proc-vmstat.numa_hit
1462962 +3.4% 1512676 proc-vmstat.numa_local
99906 ± 9% -23.1% 76793 ± 17% proc-vmstat.numa_pages_migrated
321573 ± 14% -32.0% 218744 ± 22% proc-vmstat.numa_pte_updates
2547220 +3.5% 2636020 proc-vmstat.pgalloc_normal
2441293 +3.4% 2524816 proc-vmstat.pgfree
99906 ± 9% -23.1% 76793 ± 17% proc-vmstat.pgmigrate_success
0.19 ±143% -96.6% 0.01 ± 49% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
0.32 ±133% -72.7% 0.09 ± 16% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.39 ±135% -98.1% 0.01 ± 52% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
240.34 ±217% -98.5% 3.55 ± 6% perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.02 ± 8% -19.4% 0.01 ± 8% perf-sched.total_sch_delay.average.ms
3.02 -9.3% 2.74 perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__do_sys_newuname
114.29 ± 2% +18.6% 135.51 ± 2% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
128.00 ± 4% +14.1% 146.00 perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
18816 ± 2% -16.6% 15697 ± 2% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
99.29 ±187% -98.1% 1.85 ±117% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
3.01 -9.3% 2.73 perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__do_sys_newuname
114.18 ± 2% +18.6% 135.41 ± 2% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
195.32 ±189% -98.6% 2.64 ±124% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
2.37 -1.8% 2.33 perf-stat.i.MPKI
2.841e+09 +3.2% 2.932e+09 perf-stat.i.branch-instructions
1.47 -0.1 1.40 perf-stat.i.branch-miss-rate%
26.62 +1.0 27.64 perf-stat.i.cache-miss-rate%
1.193e+08 -2.5% 1.163e+08 perf-stat.i.cache-references
243399 +7.3% 261109 perf-stat.i.context-switches
7.74 -13.1% 6.73 perf-stat.i.cpi
1.006e+11 -10.6% 8.988e+10 perf-stat.i.cpu-cycles
4164 -7.0% 3871 perf-stat.i.cpu-migrations
3253 -11.7% 2872 perf-stat.i.cycles-between-cache-misses
1.378e+10 +3.1% 1.421e+10 perf-stat.i.instructions
0.15 +15.3% 0.17 perf-stat.i.ipc
1.03 +10.3% 1.13 perf-stat.i.metric.K/sec
2.29 -1.9% 2.25 perf-stat.overall.MPKI
1.65 -0.1 1.58 perf-stat.overall.branch-miss-rate%
26.43 +1.0 27.43 perf-stat.overall.cache-miss-rate%
7.30 -13.3% 6.33 perf-stat.overall.cpi
3189 -11.6% 2818 perf-stat.overall.cycles-between-cache-misses
0.14 +15.4% 0.16 perf-stat.overall.ipc
2.794e+09 +3.2% 2.883e+09 perf-stat.ps.branch-instructions
1.174e+08 -2.6% 1.144e+08 perf-stat.ps.cache-references
239347 +7.3% 256763 perf-stat.ps.context-switches
9.894e+10 -10.6% 8.843e+10 perf-stat.ps.cpu-cycles
4096 -7.0% 3808 perf-stat.ps.cpu-migrations
1.355e+10 +3.1% 1.398e+10 perf-stat.ps.instructions
8.25e+11 +3.2% 8.511e+11 perf-stat.total.instructions
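As a quick sanity check on how the columns relate, the sketch below recomputes a few entries from the table above. It is only an illustration, not part of the lkp tooling: the helper names `pct_change` and `cpi` are hypothetical, and it assumes that %change is the relative difference between the two commits and that perf-stat.overall.cpi is cpu-cycles divided by instructions. The inputs are the rounded figures printed in the table, so recomputed percentages can differ from the reported ones in the last digit.

```python
# Values copied from the comparison table:
# BASE    = 5f01a22c5b ("ucount: use RCU for ucounts lookups")
# PATCHED = b4dc0bee2a ("ucount: use rcuref_t for reference counting")
BASE = {
    "stress-ng.set.ops": 7692146,
    "perf-stat.ps.cpu-cycles": 9.894e10,
    "perf-stat.ps.instructions": 1.355e10,
}
PATCHED = {
    "stress-ng.set.ops": 8265701,
    "perf-stat.ps.cpu-cycles": 8.843e10,
    "perf-stat.ps.instructions": 1.398e10,
}


def pct_change(old, new):
    """Relative change from old to new, in percent (assumed %change formula)."""
    return (new - old) / old * 100.0


def cpi(metrics):
    """Cycles per instruction, derived from the per-second counters."""
    return metrics["perf-stat.ps.cpu-cycles"] / metrics["perf-stat.ps.instructions"]


if __name__ == "__main__":
    for name in ("stress-ng.set.ops", "perf-stat.ps.cpu-cycles"):
        print(f"{name}: {pct_change(BASE[name], PATCHED[name]):+.1f}%")
    print(f"cpi: {cpi(BASE):.2f} -> {cpi(PATCHED):.2f}")
    print(f"ipc: {1 / cpi(BASE):.2f} -> {1 / cpi(PATCHED):.2f}")
```

Under these assumptions the recomputed values line up with the reported rows: about +7.5% for stress-ng.set.ops, about -10.6% for cpu-cycles, and a CPI of roughly 7.30 on the base commit against 6.33 with the patch. In other words, the benchmark completes more operations while spending fewer cycles per instruction.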
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki