public inbox for linux-mm@kvack.org
* [linux-next:master] [mm]  7b6218ae12: stress-ng.forkheavy.ops_per_sec 5.0% improvement
@ 2025-03-31 13:24 kernel test robot
From: kernel test robot @ 2025-03-31 13:24 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: oe-lkp, lkp, Andrew Morton, Lorenzo Stoakes, Shakeel Butt,
	Vlastimil Babka, Liam R. Howlett, Shivank Garg, Christian Brauner,
	David Hildenbrand, David Howells, Davidlohr Bueso, Hugh Dickins,
	Jann Horn, Johannes Weiner, Jonathan Corbet, Klara Modin,
	Lokesh Gidra, Mateusz Guzik, Matthew Wilcox, Mel Gorman,
	Michal Hocko, Minchan Kim, Oleg Nesterov, Pasha Tatashin,
	Paul E . McKenney, Peter Xu, Peter Zijlstra, Sourav Panda,
	Wei Yang, Will Deacon, Heiko Carstens, Stephen Rothwell, linux-mm,
	linux-kernel, oliver.sang



Hello,

kernel test robot noticed a 5.0% improvement of stress-ng.forkheavy.ops_per_sec on:


commit: 7b6218ae1253491d56f21f4b1f3609f3dd873600 ("mm: move per-vma lock into vm_area_struct")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) Platinum 8468V CPU @ 2.4GHz (Sapphire Rapids) with 384G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: forkheavy
	cpufreq_governor: performance
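
The parameters above can be approximated outside the lkp harness with a direct stress-ng invocation; this is a sketch, assuming a recent stress-ng that provides the forkheavy stressor and a system with cpupower installed:

```shell
# Approximate manual reproduction of the job parameters above (assumes a
# recent stress-ng with the forkheavy stressor; not the lkp harness itself).
sudo cpupower frequency-set -g performance    # cpufreq_governor: performance
# nr_threads: 100% -> 0 workers means "one worker per online CPU" in stress-ng
stress-ng --forkheavy 0 --timeout 60s --metrics-brief
```

The `--metrics-brief` summary at the end reports the bogo-ops and ops-per-second figures that correspond to the stress-ng.forkheavy.ops and ops_per_sec rows in the table below.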



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250331/202503311656.e3596aaf-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/igk-spr-2sp1/forkheavy/stress-ng/60s

commit: 
  b2ae5fccb8 ("mm: introduce vma_start_read_locked{_nested} helpers")
  7b6218ae12 ("mm: move per-vma lock into vm_area_struct")

b2ae5fccb8c0ec21 7b6218ae1253491d56f21f4b1f3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    382800 ±  4%     +10.2%     421797 ±  5%  numa-meminfo.node1.AnonHugePages
     32850            +5.0%      34492        stress-ng.forkheavy.ops
    493.66            +5.0%     518.50        stress-ng.forkheavy.ops_per_sec
     40.74 ± 30%     +68.2%      68.53 ± 23%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
     73.19 ± 42%     +52.2%     111.39 ± 16%  sched_debug.cfs_rq:/.util_est.avg
    222.12 ± 29%     +34.4%     298.62 ± 10%  sched_debug.cfs_rq:/.util_est.stddev
      4555 ± 10%     -45.3%       2491 ± 27%  perf-c2c.DRAM.local
     11750 ±  4%     -22.7%       9082 ± 22%  perf-c2c.HITM.local
      2592 ±  6%     -45.4%       1414 ± 23%  perf-c2c.HITM.remote
     14342 ±  4%     -26.8%      10497 ± 22%  perf-c2c.HITM.total
  41336771            -4.4%   39526485        proc-vmstat.numa_hit
  41134683            -4.4%   39326465        proc-vmstat.numa_local
  71479761            +1.8%   72742225        proc-vmstat.pgalloc_normal
   3480841            +2.4%    3564757        proc-vmstat.pgfault
  71044889            +1.7%   72274310        proc-vmstat.pgfree
      1.47 ± 86%     -73.5%       0.39 ±138%  perf-sched.sch_delay.avg.ms.__cond_resched.do_ftruncate.do_sys_ftruncate.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.33 ±108%    +205.7%       1.00 ± 83%  perf-sched.sch_delay.avg.ms.__cond_resched.down_write.do_mq_open.__x64_sys_mq_open.do_syscall_64
      0.77 ± 25%     +43.6%       1.10 ± 21%  perf-sched.sch_delay.avg.ms.__cond_resched.dput.vfs_tmpfile.path_openat.do_filp_open
      0.16 ± 17%     +44.7%       0.23 ± 26%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.58 ± 85%     -85.8%       0.08 ±130%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      3.92 ± 72%     -80.6%       0.76 ±198%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      6.96 ± 55%    +113.7%      14.88 ± 28%  perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     62.68 ± 72%    +129.9%     144.11 ±  9%  perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    334.97 ± 57%     -66.4%     112.42 ± 70%  perf-sched.wait_and_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
     82.80 ± 23%     +73.9%     143.96 ±  9%  perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      2.15 ± 43%     -72.2%       0.60 ± 94%  perf-sched.wait_time.avg.ms.__cond_resched.unmap_vmas.unmap_region.__mmap_new_vma.__mmap_region
     68.44 ±135%    +288.8%     266.12 ±121%  perf-sched.wait_time.max.ms.__cond_resched.ww_mutex_lock.drm_gem_vunmap_unlocked.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
     15.31            +8.9%      16.67 ±  2%  perf-stat.i.MPKI
 1.684e+10            -3.9%  1.618e+10        perf-stat.i.branch-instructions
  75533943            -4.7%   72015903        perf-stat.i.branch-misses
      6.71            +5.6%       7.09        perf-stat.i.cpi
  8.19e+10            -5.7%  7.726e+10        perf-stat.i.instructions
      0.16            -4.9%       0.15        perf-stat.i.ipc
     16.72            +7.0%      17.90        perf-stat.overall.MPKI
      6.53            +6.2%       6.94        perf-stat.overall.cpi
      0.15            -5.9%       0.14        perf-stat.overall.ipc
  1.66e+10            -4.2%   1.59e+10        perf-stat.ps.branch-instructions
  73765712            -5.4%   69811938        perf-stat.ps.branch-misses
 8.092e+10            -5.9%  7.612e+10        perf-stat.ps.instructions
  5.53e+12            -5.5%  5.227e+12        perf-stat.total.instructions




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



