* [viro-vfs:work.d_revalidate] [dcache]  077ab1260a: will-it-scale.per_process_ops 1.9% improvement
From: kernel test robot @ 2025-01-10  3:14 UTC
  To: Al Viro; +Cc: oe-lkp, lkp, linux-fsdevel, linux-kernel, oliver.sang



Hello,

kernel test robot noticed a 1.9% improvement of will-it-scale.per_process_ops on:


commit: 077ab1260a52068a62a5fb08fa2c5f1d0dcf2738 ("dcache: back inline names with a struct-wrapped array of unsigned long")
https://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git work.d_revalidate

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 100%
	mode: process
	test: poll2
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250110/202501101058.cd8beeba-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale

commit: 
  cf0cc84299 ("make sure that DNAME_INLINE_LEN is a multiple of word size")
  077ab1260a ("dcache: back inline names with a struct-wrapped array of unsigned long")

cf0cc842995ca3da 077ab1260a52068a62a5fb08fa2 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    294.00 ± 10%     +15.2%     338.67 ±  5%  perf-c2c.DRAM.remote
    243.33 ±  9%     +13.7%     276.67 ±  6%  perf-c2c.HITM.remote
     21502 ±  5%    +413.7%     110453 ±117%  sched_debug.cfs_rq:/.load.max
      2543 ±  6%    +336.8%      11109 ±111%  sched_debug.cfs_rq:/.load.stddev
    274.83 ± 19%     +28.8%     353.86 ±  6%  sched_debug.cfs_rq:/.util_est.min
  24387540            +1.9%   24841387        will-it-scale.104.processes
    234495            +1.9%     238859        will-it-scale.per_process_ops
  24387540            +1.9%   24841387        will-it-scale.workload
      0.85 ± 11%     -20.5%       0.68 ± 10%  perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64
      1.71 ± 11%     -20.6%       1.36 ± 10%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64
     38.41 ±104%     -78.0%       8.46        perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      3676 ± 13%     -34.3%       2415 ± 21%  perf-sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.85 ± 11%     -20.5%       0.68 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64
      3676 ± 13%     -34.3%       2415 ± 21%  perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
 4.591e+10            +1.9%  4.676e+10        perf-stat.i.branch-instructions
 1.367e+08            +1.9%  1.392e+08        perf-stat.i.branch-misses
      1.08            -1.9%       1.06        perf-stat.i.cpi
 2.584e+11            +1.9%  2.632e+11        perf-stat.i.instructions
      0.92            +1.9%       0.94        perf-stat.i.ipc
      1.08            -1.8%       1.06        perf-stat.overall.cpi
      0.93            +1.9%       0.94        perf-stat.overall.ipc
 4.575e+10            +1.9%   4.66e+10        perf-stat.ps.branch-instructions
 1.362e+08            +1.9%  1.388e+08        perf-stat.ps.branch-misses
 2.575e+11            +1.9%  2.623e+11        perf-stat.ps.instructions
 7.785e+13            +1.9%   7.93e+13        perf-stat.total.instructions
     59.17            -1.5       57.63        perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     71.18            -1.4       69.76        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     70.73            -1.4       69.32        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     72.76            -1.3       71.48        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     76.80            -1.1       75.70        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
     43.66            -1.1       42.61        perf-profile.calltrace.cycles-pp.fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
     94.61            -0.2       94.40        perf-profile.calltrace.cycles-pp.__poll
      0.92            +0.0        0.94        perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.66            +0.1        2.73        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__poll
      4.90            +0.2        5.10        perf-profile.calltrace.cycles-pp.testcase
      5.81            +0.2        6.04        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__poll
      1.98 ±  3%      +0.3        2.26 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__poll
      7.25            +0.3        7.56        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__poll
     59.29            -1.6       57.72        perf-profile.children.cycles-pp.do_poll
     71.24            -1.4       69.83        perf-profile.children.cycles-pp.__x64_sys_poll
     70.82            -1.4       69.41        perf-profile.children.cycles-pp.do_sys_poll
     72.83            -1.3       71.55        perf-profile.children.cycles-pp.do_syscall_64
     76.94            -1.1       75.84        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     43.57            -1.0       42.53        perf-profile.children.cycles-pp.fdget
     95.18            -0.2       94.97        perf-profile.children.cycles-pp.__poll
      1.16 ±  2%      +0.2        1.32 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      3.50            +0.2        3.69        perf-profile.children.cycles-pp.entry_SYSCALL_64
      4.91            +0.2        5.12        perf-profile.children.cycles-pp.testcase
      6.22            +0.2        6.46        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      7.31            +0.3        7.62        perf-profile.children.cycles-pp.syscall_return_via_sysret
     42.16            -1.0       41.16        perf-profile.self.cycles-pp.fdget
     16.86            -0.6       16.30        perf-profile.self.cycles-pp.do_poll
      0.90            +0.0        0.93        perf-profile.self.cycles-pp.kfree
      0.32 ±  2%      +0.0        0.36 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.20 ±  3%      +0.1        1.32 ±  2%  perf-profile.self.cycles-pp.__poll
      0.76 ±  2%      +0.1        0.89 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
      4.88            +0.1        5.00        perf-profile.self.cycles-pp.do_sys_poll
      3.10            +0.2        3.28        perf-profile.self.cycles-pp.entry_SYSCALL_64
      4.18            +0.2        4.37        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      4.73            +0.2        4.94        perf-profile.self.cycles-pp.testcase
      6.16            +0.2        6.40        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      7.30            +0.3        7.62        perf-profile.self.cycles-pp.syscall_return_via_sysret
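
The headline figure can be cross-checked against the raw numbers in the table above. The sketch below recomputes the relative change from the will-it-scale rows. It also assumes, which the report does not state, that per_process_ops is the total workload divided by the 104 tasks and truncated to an integer; the reported values are consistent with that reading. The function and variable names are illustrative only.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# Figures copied from the will-it-scale rows of the comparison table.
NR_TASKS = 104  # "104 threads" test machine, nr_task: 100%
base_workload, new_workload = 24387540, 24841387
base_ops, new_ops = 234495, 238859

workload_change = round(pct_change(base_workload, new_workload), 1)
ops_change = round(pct_change(base_ops, new_ops), 1)

print(f"will-it-scale.workload:        {workload_change:+.1f}%")
print(f"will-it-scale.per_process_ops: {ops_change:+.1f}%")

# Assumption: per-process ops = total workload // number of tasks.
print(base_workload // NR_TASKS, new_workload // NR_TASKS)
```

Both changes round to the +1.9% shown in the table, and the integer division reproduces the two per_process_ops values.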




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

