public inbox for linux-kernel@vger.kernel.org
* [linus:master] [ucount]  b4dc0bee2a:  stress-ng.set.ops_per_sec 7.5% improvement
@ 2025-04-22  8:42 kernel test robot
From: kernel test robot @ 2025-04-22  8:42 UTC
  To: Sebastian Andrzej Siewior
  Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Paul E. McKenney,
	Thomas Gleixner, Boqun Feng, Joel Fernandes, Josh Triplett,
	Lai Jiangshan, Mathieu Desnoyers, Mengen Sun, Steven Rostedt,
	Uladzislau Rezki (Sony), YueHong Wu, Zqiang, oliver.sang



Hello,

kernel test robot noticed a 7.5% improvement of stress-ng.set.ops_per_sec on:


commit: b4dc0bee2a749083028afba346910e198653f42a ("ucount: use rcuref_t for reference counting")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: set
	cpufreq_governor: performance



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250422/202504221604.38512645-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/set/stress-ng/60s

commit: 
  5f01a22c5b ("ucount: use RCU for ucounts lookups")
  b4dc0bee2a ("ucount: use rcuref_t for reference counting")

5f01a22c5b231dd5 b4dc0bee2a749083028afba3469 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     10.78            -1.6        9.22        mpstat.cpu.all.soft%
    150.17 ±  5%     -54.4%      68.50 ± 11%  perf-c2c.DRAM.local
     14.70 ± 13%     -18.0%      12.05 ±  8%  vmstat.procs.r
    235759            +7.0%     252328        vmstat.system.cs
    229993            -1.6%     226301        vmstat.system.in
      1456 ±  3%     -40.2%     870.70 ± 59%  sched_debug.cfs_rq:/.avg_vruntime.min
     56228 ±133%     -81.2%      10577 ± 72%  sched_debug.cfs_rq:/.load.avg
    605295 ±186%     -89.2%      65458 ± 51%  sched_debug.cfs_rq:/.load.stddev
      1456 ±  3%     -40.2%     870.70 ± 59%  sched_debug.cfs_rq:/.min_vruntime.min
    259.95 ± 14%     -33.5%     172.79 ± 21%  sched_debug.cpu.curr->pid.avg
      1122 ±  7%     -22.0%     874.99 ± 15%  sched_debug.cpu.curr->pid.stddev
   7692146            +7.5%    8265701        stress-ng.set.ops
    128199            +7.5%     137758        stress-ng.set.ops_per_sec
     28263 ±  2%     -34.1%      18622 ±  2%  stress-ng.time.involuntary_context_switches
     77524            -1.7%      76216        stress-ng.time.minor_page_faults
    750.50            -3.0%     728.33        stress-ng.time.percent_of_cpu_this_job_got
    416.18            -3.8%     400.28        stress-ng.time.system_time
   7083512            +8.2%    7667679        stress-ng.time.voluntary_context_switches
    141813            +4.2%     147721        proc-vmstat.nr_shmem
   1695593            +2.9%    1745184        proc-vmstat.numa_hit
   1462962            +3.4%    1512676        proc-vmstat.numa_local
     99906 ±  9%     -23.1%      76793 ± 17%  proc-vmstat.numa_pages_migrated
    321573 ± 14%     -32.0%     218744 ± 22%  proc-vmstat.numa_pte_updates
   2547220            +3.5%    2636020        proc-vmstat.pgalloc_normal
   2441293            +3.4%    2524816        proc-vmstat.pgfree
     99906 ±  9%     -23.1%      76793 ± 17%  proc-vmstat.pgmigrate_success
      0.19 ±143%     -96.6%       0.01 ± 49%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.32 ±133%     -72.7%       0.09 ± 16%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.39 ±135%     -98.1%       0.01 ± 52%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    240.34 ±217%     -98.5%       3.55 ±  6%  perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.02 ±  8%     -19.4%       0.01 ±  8%  perf-sched.total_sch_delay.average.ms
      3.02            -9.3%       2.74        perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__do_sys_newuname
    114.29 ±  2%     +18.6%     135.51 ±  2%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    128.00 ±  4%     +14.1%     146.00        perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
     18816 ±  2%     -16.6%      15697 ±  2%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     99.29 ±187%     -98.1%       1.85 ±117%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      3.01            -9.3%       2.73        perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__do_sys_newuname
    114.18 ±  2%     +18.6%     135.41 ±  2%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    195.32 ±189%     -98.6%       2.64 ±124%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      2.37            -1.8%       2.33        perf-stat.i.MPKI
 2.841e+09            +3.2%  2.932e+09        perf-stat.i.branch-instructions
      1.47            -0.1        1.40        perf-stat.i.branch-miss-rate%
     26.62            +1.0       27.64        perf-stat.i.cache-miss-rate%
 1.193e+08            -2.5%  1.163e+08        perf-stat.i.cache-references
    243399            +7.3%     261109        perf-stat.i.context-switches
      7.74           -13.1%       6.73        perf-stat.i.cpi
 1.006e+11           -10.6%  8.988e+10        perf-stat.i.cpu-cycles
      4164            -7.0%       3871        perf-stat.i.cpu-migrations
      3253           -11.7%       2872        perf-stat.i.cycles-between-cache-misses
 1.378e+10            +3.1%  1.421e+10        perf-stat.i.instructions
      0.15           +15.3%       0.17        perf-stat.i.ipc
      1.03           +10.3%       1.13        perf-stat.i.metric.K/sec
      2.29            -1.9%       2.25        perf-stat.overall.MPKI
      1.65            -0.1        1.58        perf-stat.overall.branch-miss-rate%
     26.43            +1.0       27.43        perf-stat.overall.cache-miss-rate%
      7.30           -13.3%       6.33        perf-stat.overall.cpi
      3189           -11.6%       2818        perf-stat.overall.cycles-between-cache-misses
      0.14           +15.4%       0.16        perf-stat.overall.ipc
 2.794e+09            +3.2%  2.883e+09        perf-stat.ps.branch-instructions
 1.174e+08            -2.6%  1.144e+08        perf-stat.ps.cache-references
    239347            +7.3%     256763        perf-stat.ps.context-switches
 9.894e+10           -10.6%  8.843e+10        perf-stat.ps.cpu-cycles
      4096            -7.0%       3808        perf-stat.ps.cpu-migrations
 1.355e+10            +3.1%  1.398e+10        perf-stat.ps.instructions
  8.25e+11            +3.2%  8.511e+11        perf-stat.total.instructions




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

