public inbox for linux-kernel@vger.kernel.org
* [linus:master] [timekeeping]  ee3283c608: will-it-scale.per_process_ops 4.8% regression
@ 2025-01-26  8:25 kernel test robot
  2025-01-26 12:23 ` Jeff Layton
  0 siblings, 1 reply; 2+ messages in thread
From: kernel test robot @ 2025-01-26  8:25 UTC
  To: Jeff Layton
  Cc: oe-lkp, lkp, linux-kernel, Christian Brauner, Thomas Gleixner,
	John Stultz, oliver.sang


hi, Jeff Layton,


We are sending the report below just FYI, since the results are stable in our tests.
We don't have enough knowledge to tell whether this regression is due to the alignment of:

+static __cacheline_aligned_in_smp atomic64_t mg_floor;

If this report is of low value, please just ignore it. Thanks a lot.
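
The alignment question can be illustrated with a little address arithmetic. The sketch below assumes 64-byte cache lines (typical for x86_64, but an assumption here) and uses hypothetical helper names and offsets. It only shows how forcing a variable onto the start of a cache line changes which line it shares with its neighbours. It says nothing about whether that is what happened in this regression.

```python
CACHE_LINE = 64  # assumed cache line size in bytes


def align_up(addr, line=CACHE_LINE):
    """Round addr up to the next multiple of the cache line size."""
    return (addr + line - 1) // line * line


def same_cache_line(a, b, line=CACHE_LINE):
    """True if byte offsets a and b fall on the same cache line."""
    return a // line == b // line


# Hypothetical layout: a neighbour at offset 96 and a new variable that
# would naturally be placed at offset 100.
neighbour = 96
unaligned = 100
aligned = align_up(unaligned)

print(aligned)                               # 128
print(same_cache_line(neighbour, unaligned)) # True
print(same_cache_line(neighbour, aligned))   # False
```

Aligning the new variable moves it off the neighbour's line, and everything placed after it shifts as well, which is why an alignment attribute can change the sharing pattern of unrelated data.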


Hello,

kernel test robot noticed a 4.8% regression of will-it-scale.per_process_ops on:


commit: ee3283c608dfa21251b0821d7bb198c7ae3189f6 ("timekeeping: Add interfaces for handling timestamps with a floor value")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      bc8198dc7ebc492ec3e9fa1617dcdfbe98e73b17]
[test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 100%
	mode: process
	test: pwrite1
	cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202501261527.c3bf4764-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250126/202501261527.c3bf4764-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/pwrite1/will-it-scale

commit: 
  v6.12-rc2
  ee3283c608 ("timekeeping: Add interfaces for handling timestamps with a floor value")

       v6.12-rc2 ee3283c608dfa21251b0821d7bb 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  57550068            -4.8%   54794800        will-it-scale.104.processes
    553365            -4.8%     526872        will-it-scale.per_process_ops
  57550068            -4.8%   54794800        will-it-scale.workload
     43.00 ± 27%     -60.0%      17.20 ± 27%  perf-c2c.DRAM.local
    251.20 ± 23%     -57.5%     106.80 ± 16%  perf-c2c.DRAM.remote
    520.00 ± 33%     -70.3%     154.20 ± 13%  perf-c2c.HITM.local
    218.50 ± 25%     -55.2%      97.80 ± 18%  perf-c2c.HITM.remote
      0.03 ± 14%     +48.4%       0.04 ±  9%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      4.18 ±  4%     +21.5%       5.08        perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
    653.70 ±  5%     +50.5%     983.70 ±  7%  perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
    913.40 ±  6%     -24.8%     686.80 ±  7%  perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
      1.29 ± 81%  +42618.3%     552.09 ± 74%  perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
      2.58 ± 81%  +65403.1%       1692 ± 72%  perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
 1.721e+10            -4.8%  1.639e+10        perf-stat.i.branch-instructions
      1.66            +0.1        1.72        perf-stat.i.branch-miss-rate%
 2.852e+08            -1.2%  2.818e+08        perf-stat.i.branch-misses
      3.29            +4.9%       3.45        perf-stat.i.cpi
 8.743e+10            -4.8%  8.327e+10        perf-stat.i.instructions
      0.30            -4.7%       0.29        perf-stat.i.ipc
      1.66            +0.1        1.72        perf-stat.overall.branch-miss-rate%
      3.29            +4.9%       3.45        perf-stat.overall.cpi
      0.30            -4.7%       0.29        perf-stat.overall.ipc
 1.715e+10            -4.8%  1.634e+10        perf-stat.ps.branch-instructions
 2.842e+08            -1.2%  2.809e+08        perf-stat.ps.branch-misses
 8.714e+10            -4.8%    8.3e+10        perf-stat.ps.instructions
 2.632e+13            -4.7%  2.508e+13        perf-stat.total.instructions
     10.62            -4.8        5.81        perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
      8.89 ±  2%      -4.6        4.25        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write
      5.98 ±  3%      -4.2        1.79 ±  2%  perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
     13.24            -1.4       11.88        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__libc_pwrite
     16.62            -1.2       15.42        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_pwrite
      2.90            -1.2        1.74        perf-profile.calltrace.cycles-pp.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
      2.38 ±  2%      -0.9        1.44        perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
      1.68 ±  2%      -0.9        0.79        perf-profile.calltrace.cycles-pp.folio_unlock.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
      1.42 ± 13%      -0.8        0.64 ±  3%  perf-profile.calltrace.cycles-pp.file_remove_privs_flags.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
      5.69            -0.7        4.99 ±  2%  perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
      6.91            -0.4        6.53        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__libc_pwrite
      1.23 ±  2%      -0.2        1.01        perf-profile.calltrace.cycles-pp.fdget.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
      1.41            -0.2        1.26        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
      0.87            -0.1        0.79 ±  2%  perf-profile.calltrace.cycles-pp.up_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
      0.79 ±  2%      -0.1        0.74        perf-profile.calltrace.cycles-pp.noop_dirty_folio.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
      1.15 ±  2%      +0.1        1.26 ±  2%  perf-profile.calltrace.cycles-pp.down_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
      0.54            +0.2        0.73        perf-profile.calltrace.cycles-pp.folio_mark_accessed.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
      0.82 ±  2%      +0.4        1.26 ±  5%  perf-profile.calltrace.cycles-pp.folio_mark_dirty.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
      0.00            +0.7        0.67        perf-profile.calltrace.cycles-pp.balance_dirty_pages_ratelimited_flags.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
      2.10            +1.2        3.35        perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write
      2.36            +1.3        3.69        perf-profile.calltrace.cycles-pp.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
     46.08            +2.8       48.91        perf-profile.calltrace.cycles-pp.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
     43.76            +3.3       47.02        perf-profile.calltrace.cycles-pp.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
     58.89            +3.4       62.32        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pwrite
     38.55            +3.5       42.07        perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe
     49.37            +3.7       53.09        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
     29.41            +5.6       34.99        perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
      4.60            +7.7       12.30        perf-profile.calltrace.cycles-pp.rep_movs_alternative.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write
      6.68           +10.3       16.96        perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
     10.69            -4.8        5.86        perf-profile.children.cycles-pp.shmem_write_begin
      8.99 ±  2%      -4.6        4.35        perf-profile.children.cycles-pp.shmem_get_folio_gfp
      6.00 ±  3%      -4.2        1.81 ±  2%  perf-profile.children.cycles-pp.filemap_get_entry
     14.20            -1.4       12.77        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.62 ±  9%      -1.3        0.37 ±  5%  perf-profile.children.cycles-pp.xas_load
     16.76            -1.2       15.54        perf-profile.children.cycles-pp.syscall_return_via_sysret
      2.96            -1.2        1.79        perf-profile.children.cycles-pp.file_update_time
      2.47 ±  2%      -1.0        1.51        perf-profile.children.cycles-pp.inode_needs_update_time
      1.69 ±  2%      -0.9        0.79        perf-profile.children.cycles-pp.folio_unlock
      1.44 ± 13%      -0.8        0.65 ±  3%  perf-profile.children.cycles-pp.file_remove_privs_flags
      5.94            -0.7        5.24 ±  2%  perf-profile.children.cycles-pp.shmem_write_end
      7.17            -0.5        6.67        perf-profile.children.cycles-pp.entry_SYSCALL_64
      1.77            -0.4        1.42        perf-profile.children.cycles-pp.__cond_resched
      0.67 ±  3%      -0.3        0.41        perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      1.68 ±  9%      -0.2        1.42 ±  4%  perf-profile.children.cycles-pp.generic_write_checks
      1.25            -0.2        1.03        perf-profile.children.cycles-pp.fdget
      1.44            -0.2        1.28        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.38 ±  3%      -0.1        0.27 ±  2%  perf-profile.children.cycles-pp.timestamp_truncate
      0.37 ±  4%      -0.1        0.26        perf-profile.children.cycles-pp.rw_verify_area
      0.69 ±  3%      -0.1        0.60        perf-profile.children.cycles-pp.rcu_all_qs
      0.90            -0.1        0.82 ±  2%  perf-profile.children.cycles-pp.up_write
      0.23 ±  5%      -0.1        0.16 ±  2%  perf-profile.children.cycles-pp.xas_start
      0.85            -0.1        0.80        perf-profile.children.cycles-pp.noop_dirty_folio
      0.23 ±  4%      -0.0        0.20 ±  3%  perf-profile.children.cycles-pp.x64_sys_call
      0.15 ±  5%      -0.0        0.11 ±  4%  perf-profile.children.cycles-pp.security_file_permission
      0.28 ±  2%      -0.0        0.26        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.17 ±  5%      +0.0        0.19 ±  3%  perf-profile.children.cycles-pp.sched_tick
      1.18            +0.1        1.28 ±  2%  perf-profile.children.cycles-pp.down_write
      0.35 ±  3%      +0.1        0.48 ±  6%  perf-profile.children.cycles-pp.folio_mapping
      0.50 ±  2%      +0.2        0.69        perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
      0.55 ±  2%      +0.2        0.75        perf-profile.children.cycles-pp.folio_mark_accessed
      1.75 ±  2%      +0.4        2.10 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.90            +0.5        1.36 ±  5%  perf-profile.children.cycles-pp.folio_mark_dirty
      2.17            +1.2        3.41        perf-profile.children.cycles-pp.fault_in_readable
      2.40            +1.4        3.75        perf-profile.children.cycles-pp.fault_in_iov_iter_readable
     46.10            +2.8       48.93        perf-profile.children.cycles-pp.__x64_sys_pwrite64
     43.86            +3.2       47.10        perf-profile.children.cycles-pp.vfs_write
     39.00            +3.4       42.41        perf-profile.children.cycles-pp.shmem_file_write_iter
     59.15            +3.4       62.56        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     49.50            +3.7       53.21        perf-profile.children.cycles-pp.do_syscall_64
     29.56            +5.6       35.14        perf-profile.children.cycles-pp.generic_perform_write
      4.74            +8.3       13.02        perf-profile.children.cycles-pp.rep_movs_alternative
      6.85            +9.6       16.44        perf-profile.children.cycles-pp.copy_page_from_iter_atomic
      4.34 ±  2%      -2.9        1.43 ±  2%  perf-profile.self.cycles-pp.filemap_get_entry
     14.06            -1.4       12.65        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
     16.74            -1.2       15.53        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.39 ± 10%      -1.2        0.21 ±  8%  perf-profile.self.cycles-pp.xas_load
      1.49 ±  3%      -0.9        0.58        perf-profile.self.cycles-pp.folio_unlock
      2.72 ±  2%      -0.9        1.83        perf-profile.self.cycles-pp.__libc_pwrite
      1.42 ± 13%      -0.8        0.61 ±  3%  perf-profile.self.cycles-pp.file_remove_privs_flags
      1.42            -0.6        0.83        perf-profile.self.cycles-pp.inode_needs_update_time
      1.92 ±  5%      -0.5        1.44        perf-profile.self.cycles-pp.shmem_get_folio_gfp
      6.24            -0.4        5.81        perf-profile.self.cycles-pp.entry_SYSCALL_64
      9.82            -0.3        9.50        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.64 ±  3%      -0.3        0.38        perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      1.06 ±  2%      -0.3        0.79        perf-profile.self.cycles-pp.__cond_resched
      1.74 ±  5%      -0.2        1.52 ±  2%  perf-profile.self.cycles-pp.shmem_write_begin
      1.24 ±  2%      -0.2        1.03        perf-profile.self.cycles-pp.fdget
      0.45 ±  3%      -0.2        0.25        perf-profile.self.cycles-pp.file_update_time
      0.98 ±  2%      -0.2        0.79 ±  2%  perf-profile.self.cycles-pp.__x64_sys_pwrite64
      2.73 ±  2%      -0.2        2.54 ±  2%  perf-profile.self.cycles-pp.shmem_write_end
      0.72 ±  5%      -0.1        0.58 ±  4%  perf-profile.self.cycles-pp.generic_write_checks
      1.14            -0.1        1.02        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.36 ±  3%      -0.1        0.25 ±  2%  perf-profile.self.cycles-pp.timestamp_truncate
      0.23 ±  4%      -0.1        0.15 ±  2%  perf-profile.self.cycles-pp.rw_verify_area
      0.60 ±  3%      -0.1        0.53        perf-profile.self.cycles-pp.rcu_all_qs
      0.81            -0.1        0.74        perf-profile.self.cycles-pp.noop_dirty_folio
      0.20 ±  4%      -0.1        0.14 ±  2%  perf-profile.self.cycles-pp.xas_start
      0.81            -0.1        0.75 ±  2%  perf-profile.self.cycles-pp.up_write
      0.21 ±  3%      -0.0        0.18 ±  3%  perf-profile.self.cycles-pp.x64_sys_call
      0.26 ±  2%      -0.0        0.23 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.12 ±  6%      -0.0        0.09 ±  4%  perf-profile.self.cycles-pp.security_file_permission
      0.21 ±  4%      +0.0        0.24        perf-profile.self.cycles-pp.testcase
      0.77 ±  2%      +0.0        0.82 ±  3%  perf-profile.self.cycles-pp.down_write
      0.24 ±  3%      +0.1        0.36        perf-profile.self.cycles-pp.fault_in_iov_iter_readable
      0.30 ±  3%      +0.1        0.43 ±  6%  perf-profile.self.cycles-pp.folio_mapping
      0.35 ±  2%      +0.2        0.54        perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
      2.74            +0.2        2.93 ±  2%  perf-profile.self.cycles-pp.generic_perform_write
      0.52            +0.2        0.72        perf-profile.self.cycles-pp.folio_mark_accessed
      0.55 ±  2%      +0.3        0.87 ±  5%  perf-profile.self.cycles-pp.folio_mark_dirty
      0.56            +0.5        1.10 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.48 ±  2%      +1.1        2.55 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
      2.14            +1.2        3.35        perf-profile.self.cycles-pp.fault_in_readable
      2.20            +1.3        3.51 ±  2%  perf-profile.self.cycles-pp.copy_page_from_iter_atomic
      4.59            +8.2       12.80        perf-profile.self.cycles-pp.rep_movs_alternative
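
For readers checking the table by hand, the sketch below recomputes a few of the `%change` values from the two value columns. It assumes that rows shown with a `%` sign are relative changes and that rows without one (the `perf-profile` and `branch-miss-rate%` rows) are absolute differences in percentage points. That reading matches the numbers above, but the helper names are illustrative and not part of the LKP tooling.

```python
def percent_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference, used for rows that are already percentages."""
    return new - base


# (metric, base value on v6.12-rc2, value on ee3283c608), copied from the table
relative_rows = [
    ("will-it-scale.workload", 57550068, 54794800),
    ("perf-stat.i.cpi", 3.29, 3.45),
    ("perf-c2c.HITM.local", 520.00, 154.20),
]
for name, base, new in relative_rows:
    print(f"{name}: {percent_change(base, new):+.1f}%")

# perf-profile rows report a difference in percentage points instead
print(f"shmem_write_begin: {point_delta(10.62, 5.81):+.1f}")
```

Small mismatches against the report are possible for other rows, because the table values are rounded averages over several runs.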




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [linus:master] [timekeeping]  ee3283c608: will-it-scale.per_process_ops 4.8% regression
  2025-01-26  8:25 [linus:master] [timekeeping] ee3283c608: will-it-scale.per_process_ops 4.8% regression kernel test robot
@ 2025-01-26 12:23 ` Jeff Layton
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff Layton @ 2025-01-26 12:23 UTC
  To: kernel test robot
  Cc: oe-lkp, lkp, linux-kernel, Christian Brauner, Thomas Gleixner,
	John Stultz

On Sun, 2025-01-26 at 16:25 +0800, kernel test robot wrote:
> hi, Jeff Layton,
> 
> 
> we make out below report just FYI since the results is stable in our tests.
> we don't have enough knowledge if this regression is due to align.
> 
> +static __cacheline_aligned_in_smp atomic64_t mg_floor;
> 
> if low value, please just ignore. thanks a lot.
> 


I think this is more or less the same regression we measured with the
pipe1 test during the rc phase:

    https://lore.kernel.org/linux-fsdevel/202410091041.6f5d221e-oliver.sang@intel.com/

This test just measures how fast it can do writes into a file in /tmp
without doing anything else in between. I don't think there is much we
can do to mitigate the perf hit here, as there is a basic cost to
fetching and handling the floor and ctime consistently.
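
As a rough picture of that workload, here is a minimal sketch of a tight pwrite loop against a temporary file. The real will-it-scale test is written in C. The 4096-byte buffer, the fixed offset of 0, and the function name are assumptions made for illustration.

```python
import os
import tempfile


def pwrite_loop(n_ops, buflen=4096):
    """Write the same buffer to offset 0 of a temp file n_ops times."""
    buf = b"\0" * buflen
    ops = 0
    # TemporaryFile uses the default temp directory (commonly /tmp).
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        for _ in range(n_ops):
            os.pwrite(fd, buf, 0)  # nothing else happens between writes
            ops += 1
    return ops


print(pwrite_loop(1000))
```

Because the loop does nothing but write, per-write bookkeeping such as the `file_update_time` path seen in the profile makes up a visible share of each iteration.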

> 
> Hello,
> 
> kernel test robot noticed a 4.8% regression of will-it-scale.per_process_ops on:
> 
> 
> commit: ee3283c608dfa21251b0821d7bb198c7ae3189f6 ("timekeeping: Add interfaces for handling timestamps with a floor value")

That patch just adds two new interfaces, but the first caller of them
wasn't added until a later patch. Are you sure that bisect landed in
the right place?
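
For context on what a timestamp "floor" means, here is a toy model. It is not the kernel code, and the semantics are an assumption based on the commit title: a coarse read returns the later of the coarse clock and a shared floor, and a fine-grained read tries to advance the floor. The class and method names are hypothetical, and a lock stands in for an atomic compare-and-exchange.

```python
import threading


class FloorClock:
    """Toy model of a coarse clock with a shared floor (not kernel code)."""

    def __init__(self):
        self._floor = 0                # shared floor, in nanoseconds
        self._lock = threading.Lock()  # stands in for an atomic cmpxchg

    def coarse(self, coarse_now):
        # Return the later of the coarse reading and the current floor.
        return max(coarse_now, self._floor)

    def fine(self, fine_now):
        # Try to advance the floor to the fine-grained reading. If another
        # caller already pushed it further, return that value instead.
        with self._lock:
            if fine_now > self._floor:
                self._floor = fine_now
            return self._floor


clock = FloorClock()
print(clock.fine(1500))    # 1500: the floor advances
print(clock.coarse(1000))  # 1500: a stale coarse reading is clamped
print(clock.coarse(2000))  # 2000: the coarse clock has passed the floor
```

In this model every timestamp read touches one shared value, so some per-call cost would only appear once there are callers on the write path.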

> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> [test failed on linus/master      bc8198dc7ebc492ec3e9fa1617dcdfbe98e73b17]
> [test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]
> 
> testcase: will-it-scale
> config: x86_64-rhel-9.4
> compiler: gcc-12
> test machine: 104 threads 2 sockets (Skylake) with 192G memory
> parameters:
> 
> 	nr_task: 100%
> 	mode: process
> 	test: pwrite1
> 	cpufreq_governor: performance
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202501261527.c3bf4764-lkp@intel.com
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20250126/202501261527.c3bf4764-lkp@intel.com
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
>   gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/pwrite1/will-it-scale
> 
> commit: 
>   v6.12-rc2
>   ee3283c608 ("timekeeping: Add interfaces for handling timestamps with a floor value")
> 
>        v6.12-rc2 ee3283c608dfa21251b0821d7bb 
> ---------------- --------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>   57550068            -4.8%   54794800        will-it-scale.104.processes
>     553365            -4.8%     526872        will-it-scale.per_process_ops
>   57550068            -4.8%   54794800        will-it-scale.workload
>      43.00 ± 27%     -60.0%      17.20 ± 27%  perf-c2c.DRAM.local
>     251.20 ± 23%     -57.5%     106.80 ± 16%  perf-c2c.DRAM.remote
>     520.00 ± 33%     -70.3%     154.20 ± 13%  perf-c2c.HITM.local
>     218.50 ± 25%     -55.2%      97.80 ± 18%  perf-c2c.HITM.remote
>       0.03 ± 14%     +48.4%       0.04 ±  9%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
>       4.18 ±  4%     +21.5%       5.08        perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
>     653.70 ±  5%     +50.5%     983.70 ±  7%  perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
>     913.40 ±  6%     -24.8%     686.80 ±  7%  perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
>       1.29 ± 81%  +42618.3%     552.09 ± 74%  perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
>       2.58 ± 81%  +65403.1%       1692 ± 72%  perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
>  1.721e+10            -4.8%  1.639e+10        perf-stat.i.branch-instructions
>       1.66            +0.1        1.72        perf-stat.i.branch-miss-rate%
>  2.852e+08            -1.2%  2.818e+08        perf-stat.i.branch-misses
>       3.29            +4.9%       3.45        perf-stat.i.cpi
>  8.743e+10            -4.8%  8.327e+10        perf-stat.i.instructions
>       0.30            -4.7%       0.29        perf-stat.i.ipc
>       1.66            +0.1        1.72        perf-stat.overall.branch-miss-rate%
>       3.29            +4.9%       3.45        perf-stat.overall.cpi
>       0.30            -4.7%       0.29        perf-stat.overall.ipc
>  1.715e+10            -4.8%  1.634e+10        perf-stat.ps.branch-instructions
>  2.842e+08            -1.2%  2.809e+08        perf-stat.ps.branch-misses
>  8.714e+10            -4.8%    8.3e+10        perf-stat.ps.instructions
>  2.632e+13            -4.7%  2.508e+13        perf-stat.total.instructions
>      10.62            -4.8        5.81        perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
>       8.89 ±  2%      -4.6        4.25        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write
>       5.98 ±  3%      -4.2        1.79 ±  2%  perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
>      13.24            -1.4       11.88        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__libc_pwrite
>      16.62            -1.2       15.42        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_pwrite
>       2.90            -1.2        1.74        perf-profile.calltrace.cycles-pp.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
>       2.38 ±  2%      -0.9        1.44        perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
>       1.68 ±  2%      -0.9        0.79        perf-profile.calltrace.cycles-pp.folio_unlock.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
>       1.42 ± 13%      -0.8        0.64 ±  3%  perf-profile.calltrace.cycles-pp.file_remove_privs_flags.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
>       5.69            -0.7        4.99 ±  2%  perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
>       6.91            -0.4        6.53        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__libc_pwrite
>       1.23 ±  2%      -0.2        1.01        perf-profile.calltrace.cycles-pp.fdget.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
>       1.41            -0.2        1.26        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
>       0.87            -0.1        0.79 ±  2%  perf-profile.calltrace.cycles-pp.up_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
>       0.79 ±  2%      -0.1        0.74        perf-profile.calltrace.cycles-pp.noop_dirty_folio.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
>       1.15 ±  2%      +0.1        1.26 ±  2%  perf-profile.calltrace.cycles-pp.down_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
>       0.54            +0.2        0.73        perf-profile.calltrace.cycles-pp.folio_mark_accessed.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
>       0.82 ±  2%      +0.4        1.26 ±  5%  perf-profile.calltrace.cycles-pp.folio_mark_dirty.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
>       0.00            +0.7        0.67        perf-profile.calltrace.cycles-pp.balance_dirty_pages_ratelimited_flags.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
>       2.10            +1.2        3.35        perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write
>       2.36            +1.3        3.69        perf-profile.calltrace.cycles-pp.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
>      46.08            +2.8       48.91        perf-profile.calltrace.cycles-pp.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
>      43.76            +3.3       47.02        perf-profile.calltrace.cycles-pp.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
>      58.89            +3.4       62.32        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pwrite
>      38.55            +3.5       42.07        perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      49.37            +3.7       53.09        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
>      29.41            +5.6       34.99        perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
>       4.60            +7.7       12.30        perf-profile.calltrace.cycles-pp.rep_movs_alternative.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write
>       6.68           +10.3       16.96        perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
>      10.69            -4.8        5.86        perf-profile.children.cycles-pp.shmem_write_begin
>       8.99 ±  2%      -4.6        4.35        perf-profile.children.cycles-pp.shmem_get_folio_gfp
>       6.00 ±  3%      -4.2        1.81 ±  2%  perf-profile.children.cycles-pp.filemap_get_entry
>      14.20            -1.4       12.77        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
>       1.62 ±  9%      -1.3        0.37 ±  5%  perf-profile.children.cycles-pp.xas_load
>      16.76            -1.2       15.54        perf-profile.children.cycles-pp.syscall_return_via_sysret
>       2.96            -1.2        1.79        perf-profile.children.cycles-pp.file_update_time
>       2.47 ±  2%      -1.0        1.51        perf-profile.children.cycles-pp.inode_needs_update_time
>       1.69 ±  2%      -0.9        0.79        perf-profile.children.cycles-pp.folio_unlock
>       1.44 ± 13%      -0.8        0.65 ±  3%  perf-profile.children.cycles-pp.file_remove_privs_flags
>       5.94            -0.7        5.24 ±  2%  perf-profile.children.cycles-pp.shmem_write_end
>       7.17            -0.5        6.67        perf-profile.children.cycles-pp.entry_SYSCALL_64
>       1.77            -0.4        1.42        perf-profile.children.cycles-pp.__cond_resched
>       0.67 ±  3%      -0.3        0.41        perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
>       1.68 ±  9%      -0.2        1.42 ±  4%  perf-profile.children.cycles-pp.generic_write_checks
>       1.25            -0.2        1.03        perf-profile.children.cycles-pp.fdget
>       1.44            -0.2        1.28        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
>       0.38 ±  3%      -0.1        0.27 ±  2%  perf-profile.children.cycles-pp.timestamp_truncate
>       0.37 ±  4%      -0.1        0.26        perf-profile.children.cycles-pp.rw_verify_area
>       0.69 ±  3%      -0.1        0.60        perf-profile.children.cycles-pp.rcu_all_qs
>       0.90            -0.1        0.82 ±  2%  perf-profile.children.cycles-pp.up_write
>       0.23 ±  5%      -0.1        0.16 ±  2%  perf-profile.children.cycles-pp.xas_start
>       0.85            -0.1        0.80        perf-profile.children.cycles-pp.noop_dirty_folio
>       0.23 ±  4%      -0.0        0.20 ±  3%  perf-profile.children.cycles-pp.x64_sys_call
>       0.15 ±  5%      -0.0        0.11 ±  4%  perf-profile.children.cycles-pp.security_file_permission
>       0.28 ±  2%      -0.0        0.26        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
>       0.17 ±  5%      +0.0        0.19 ±  3%  perf-profile.children.cycles-pp.sched_tick
>       1.18            +0.1        1.28 ±  2%  perf-profile.children.cycles-pp.down_write
>       0.35 ±  3%      +0.1        0.48 ±  6%  perf-profile.children.cycles-pp.folio_mapping
>       0.50 ±  2%      +0.2        0.69        perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
>       0.55 ±  2%      +0.2        0.75        perf-profile.children.cycles-pp.folio_mark_accessed
>       1.75 ±  2%      +0.4        2.10 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
>       0.90            +0.5        1.36 ±  5%  perf-profile.children.cycles-pp.folio_mark_dirty
>       2.17            +1.2        3.41        perf-profile.children.cycles-pp.fault_in_readable
>       2.40            +1.4        3.75        perf-profile.children.cycles-pp.fault_in_iov_iter_readable
>      46.10            +2.8       48.93        perf-profile.children.cycles-pp.__x64_sys_pwrite64
>      43.86            +3.2       47.10        perf-profile.children.cycles-pp.vfs_write
>      39.00            +3.4       42.41        perf-profile.children.cycles-pp.shmem_file_write_iter
>      59.15            +3.4       62.56        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>      49.50            +3.7       53.21        perf-profile.children.cycles-pp.do_syscall_64
>      29.56            +5.6       35.14        perf-profile.children.cycles-pp.generic_perform_write
>       4.74            +8.3       13.02        perf-profile.children.cycles-pp.rep_movs_alternative
>       6.85            +9.6       16.44        perf-profile.children.cycles-pp.copy_page_from_iter_atomic
>       4.34 ±  2%      -2.9        1.43 ±  2%  perf-profile.self.cycles-pp.filemap_get_entry
>      14.06            -1.4       12.65        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
>      16.74            -1.2       15.53        perf-profile.self.cycles-pp.syscall_return_via_sysret
>       1.39 ± 10%      -1.2        0.21 ±  8%  perf-profile.self.cycles-pp.xas_load
>       1.49 ±  3%      -0.9        0.58        perf-profile.self.cycles-pp.folio_unlock
>       2.72 ±  2%      -0.9        1.83        perf-profile.self.cycles-pp.__libc_pwrite
>       1.42 ± 13%      -0.8        0.61 ±  3%  perf-profile.self.cycles-pp.file_remove_privs_flags
>       1.42            -0.6        0.83        perf-profile.self.cycles-pp.inode_needs_update_time
>       1.92 ±  5%      -0.5        1.44        perf-profile.self.cycles-pp.shmem_get_folio_gfp
>       6.24            -0.4        5.81        perf-profile.self.cycles-pp.entry_SYSCALL_64
>       9.82            -0.3        9.50        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
>       0.64 ±  3%      -0.3        0.38        perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
>       1.06 ±  2%      -0.3        0.79        perf-profile.self.cycles-pp.__cond_resched
>       1.74 ±  5%      -0.2        1.52 ±  2%  perf-profile.self.cycles-pp.shmem_write_begin
>       1.24 ±  2%      -0.2        1.03        perf-profile.self.cycles-pp.fdget
>       0.45 ±  3%      -0.2        0.25        perf-profile.self.cycles-pp.file_update_time
>       0.98 ±  2%      -0.2        0.79 ±  2%  perf-profile.self.cycles-pp.__x64_sys_pwrite64
>       2.73 ±  2%      -0.2        2.54 ±  2%  perf-profile.self.cycles-pp.shmem_write_end
>       0.72 ±  5%      -0.1        0.58 ±  4%  perf-profile.self.cycles-pp.generic_write_checks
>       1.14            -0.1        1.02        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
>       0.36 ±  3%      -0.1        0.25 ±  2%  perf-profile.self.cycles-pp.timestamp_truncate
>       0.23 ±  4%      -0.1        0.15 ±  2%  perf-profile.self.cycles-pp.rw_verify_area
>       0.60 ±  3%      -0.1        0.53        perf-profile.self.cycles-pp.rcu_all_qs
>       0.81            -0.1        0.74        perf-profile.self.cycles-pp.noop_dirty_folio
>       0.20 ±  4%      -0.1        0.14 ±  2%  perf-profile.self.cycles-pp.xas_start
>       0.81            -0.1        0.75 ±  2%  perf-profile.self.cycles-pp.up_write
>       0.21 ±  3%      -0.0        0.18 ±  3%  perf-profile.self.cycles-pp.x64_sys_call
>       0.26 ±  2%      -0.0        0.23 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
>       0.12 ±  6%      -0.0        0.09 ±  4%  perf-profile.self.cycles-pp.security_file_permission
>       0.21 ±  4%      +0.0        0.24        perf-profile.self.cycles-pp.testcase
>       0.77 ±  2%      +0.0        0.82 ±  3%  perf-profile.self.cycles-pp.down_write
>       0.24 ±  3%      +0.1        0.36        perf-profile.self.cycles-pp.fault_in_iov_iter_readable
>       0.30 ±  3%      +0.1        0.43 ±  6%  perf-profile.self.cycles-pp.folio_mapping
>       0.35 ±  2%      +0.2        0.54        perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
>       2.74            +0.2        2.93 ±  2%  perf-profile.self.cycles-pp.generic_perform_write
>       0.52            +0.2        0.72        perf-profile.self.cycles-pp.folio_mark_accessed
>       0.55 ±  2%      +0.3        0.87 ±  5%  perf-profile.self.cycles-pp.folio_mark_dirty
>       0.56            +0.5        1.10 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
>       1.48 ±  2%      +1.1        2.55 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
>       2.14            +1.2        3.35        perf-profile.self.cycles-pp.fault_in_readable
>       2.20            +1.3        3.51 ±  2%  perf-profile.self.cycles-pp.copy_page_from_iter_atomic
>       4.59            +8.2       12.80        perf-profile.self.cycles-pp.rep_movs_alternative
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>
