* [linus:master] [timekeeping] ee3283c608: will-it-scale.per_process_ops 4.8% regression
@ 2025-01-26 8:25 kernel test robot
2025-01-26 12:23 ` Jeff Layton
0 siblings, 1 reply; 2+ messages in thread
From: kernel test robot @ 2025-01-26 8:25 UTC
To: Jeff Layton
Cc: oe-lkp, lkp, linux-kernel, Christian Brauner, Thomas Gleixner,
John Stultz, oliver.sang
Hi Jeff Layton,
We are sending the report below just FYI, since the results are stable in our tests.
We don't know enough to tell whether this regression is due to the alignment of:
+static __cacheline_aligned_in_smp atomic64_t mg_floor;
If this is of low value, please just ignore it. Thanks a lot.
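For context on the alignment question: `__cacheline_aligned_in_smp` asks for the variable to start on a cache-line boundary on SMP builds. A minimal userspace sketch of the same idea follows, assuming a 64-byte cache line and using C11 `alignas` as a stand-in for the kernel macro. The names `floor_val`, `padded_pair`, `padded` and `same_cacheline` are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64 /* assumption: 64-byte lines, typical for x86_64 */

/* Hypothetical stand-in for a cacheline-aligned global such as mg_floor. */
static alignas(CACHELINE) _Atomic int64_t floor_val;

/* Two hot counters, each forced onto its own cache line. */
struct padded_pair {
    alignas(CACHELINE) _Atomic int64_t a;
    alignas(CACHELINE) _Atomic int64_t b;
};

/* Alignment inflates the struct: two lines instead of 16 bytes. */
static_assert(sizeof(struct padded_pair) == 2 * CACHELINE,
              "each member occupies a full cache line");

static struct padded_pair padded;

/* Returns nonzero if both addresses fall on the same cache line. */
static int same_cacheline(const void *p, const void *q)
{
    return ((uintptr_t)p / CACHELINE) == ((uintptr_t)q / CACHELINE);
}
```

Aligning a variable this way changes where its neighbours land in memory, so an alignment change can shift cache behaviour even in code that never touches the variable. Whether that is what happened here is not established by the report.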
Hello,
kernel test robot noticed a 4.8% regression of will-it-scale.per_process_ops on:
commit: ee3283c608dfa21251b0821d7bb198c7ae3189f6 ("timekeeping: Add interfaces for handling timestamps with a floor value")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master bc8198dc7ebc492ec3e9fa1617dcdfbe98e73b17]
[test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:
nr_task: 100%
mode: process
test: pwrite1
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202501261527.c3bf4764-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250126/202501261527.c3bf4764-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/pwrite1/will-it-scale
commit:
v6.12-rc2
ee3283c608 ("timekeeping: Add interfaces for handling timestamps with a floor value")
v6.12-rc2 ee3283c608dfa21251b0821d7bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
57550068 -4.8% 54794800 will-it-scale.104.processes
553365 -4.8% 526872 will-it-scale.per_process_ops
57550068 -4.8% 54794800 will-it-scale.workload
43.00 ± 27% -60.0% 17.20 ± 27% perf-c2c.DRAM.local
251.20 ± 23% -57.5% 106.80 ± 16% perf-c2c.DRAM.remote
520.00 ± 33% -70.3% 154.20 ± 13% perf-c2c.HITM.local
218.50 ± 25% -55.2% 97.80 ± 18% perf-c2c.HITM.remote
0.03 ± 14% +48.4% 0.04 ± 9% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
4.18 ± 4% +21.5% 5.08 perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
653.70 ± 5% +50.5% 983.70 ± 7% perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
913.40 ± 6% -24.8% 686.80 ± 7% perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
1.29 ± 81% +42618.3% 552.09 ± 74% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
2.58 ± 81% +65403.1% 1692 ± 72% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
1.721e+10 -4.8% 1.639e+10 perf-stat.i.branch-instructions
1.66 +0.1 1.72 perf-stat.i.branch-miss-rate%
2.852e+08 -1.2% 2.818e+08 perf-stat.i.branch-misses
3.29 +4.9% 3.45 perf-stat.i.cpi
8.743e+10 -4.8% 8.327e+10 perf-stat.i.instructions
0.30 -4.7% 0.29 perf-stat.i.ipc
1.66 +0.1 1.72 perf-stat.overall.branch-miss-rate%
3.29 +4.9% 3.45 perf-stat.overall.cpi
0.30 -4.7% 0.29 perf-stat.overall.ipc
1.715e+10 -4.8% 1.634e+10 perf-stat.ps.branch-instructions
2.842e+08 -1.2% 2.809e+08 perf-stat.ps.branch-misses
8.714e+10 -4.8% 8.3e+10 perf-stat.ps.instructions
2.632e+13 -4.7% 2.508e+13 perf-stat.total.instructions
10.62 -4.8 5.81 perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
8.89 ± 2% -4.6 4.25 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write
5.98 ± 3% -4.2 1.79 ± 2% perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
13.24 -1.4 11.88 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__libc_pwrite
16.62 -1.2 15.42 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_pwrite
2.90 -1.2 1.74 perf-profile.calltrace.cycles-pp.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
2.38 ± 2% -0.9 1.44 perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
1.68 ± 2% -0.9 0.79 perf-profile.calltrace.cycles-pp.folio_unlock.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
1.42 ± 13% -0.8 0.64 ± 3% perf-profile.calltrace.cycles-pp.file_remove_privs_flags.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
5.69 -0.7 4.99 ± 2% perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
6.91 -0.4 6.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__libc_pwrite
1.23 ± 2% -0.2 1.01 perf-profile.calltrace.cycles-pp.fdget.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
1.41 -0.2 1.26 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
0.87 -0.1 0.79 ± 2% perf-profile.calltrace.cycles-pp.up_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
0.79 ± 2% -0.1 0.74 perf-profile.calltrace.cycles-pp.noop_dirty_folio.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
1.15 ± 2% +0.1 1.26 ± 2% perf-profile.calltrace.cycles-pp.down_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
0.54 +0.2 0.73 perf-profile.calltrace.cycles-pp.folio_mark_accessed.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
0.82 ± 2% +0.4 1.26 ± 5% perf-profile.calltrace.cycles-pp.folio_mark_dirty.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
0.00 +0.7 0.67 perf-profile.calltrace.cycles-pp.balance_dirty_pages_ratelimited_flags.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
2.10 +1.2 3.35 perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write
2.36 +1.3 3.69 perf-profile.calltrace.cycles-pp.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
46.08 +2.8 48.91 perf-profile.calltrace.cycles-pp.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
43.76 +3.3 47.02 perf-profile.calltrace.cycles-pp.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
58.89 +3.4 62.32 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pwrite
38.55 +3.5 42.07 perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe
49.37 +3.7 53.09 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
29.41 +5.6 34.99 perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
4.60 +7.7 12.30 perf-profile.calltrace.cycles-pp.rep_movs_alternative.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write
6.68 +10.3 16.96 perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
10.69 -4.8 5.86 perf-profile.children.cycles-pp.shmem_write_begin
8.99 ± 2% -4.6 4.35 perf-profile.children.cycles-pp.shmem_get_folio_gfp
6.00 ± 3% -4.2 1.81 ± 2% perf-profile.children.cycles-pp.filemap_get_entry
14.20 -1.4 12.77 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
1.62 ± 9% -1.3 0.37 ± 5% perf-profile.children.cycles-pp.xas_load
16.76 -1.2 15.54 perf-profile.children.cycles-pp.syscall_return_via_sysret
2.96 -1.2 1.79 perf-profile.children.cycles-pp.file_update_time
2.47 ± 2% -1.0 1.51 perf-profile.children.cycles-pp.inode_needs_update_time
1.69 ± 2% -0.9 0.79 perf-profile.children.cycles-pp.folio_unlock
1.44 ± 13% -0.8 0.65 ± 3% perf-profile.children.cycles-pp.file_remove_privs_flags
5.94 -0.7 5.24 ± 2% perf-profile.children.cycles-pp.shmem_write_end
7.17 -0.5 6.67 perf-profile.children.cycles-pp.entry_SYSCALL_64
1.77 -0.4 1.42 perf-profile.children.cycles-pp.__cond_resched
0.67 ± 3% -0.3 0.41 perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
1.68 ± 9% -0.2 1.42 ± 4% perf-profile.children.cycles-pp.generic_write_checks
1.25 -0.2 1.03 perf-profile.children.cycles-pp.fdget
1.44 -0.2 1.28 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.38 ± 3% -0.1 0.27 ± 2% perf-profile.children.cycles-pp.timestamp_truncate
0.37 ± 4% -0.1 0.26 perf-profile.children.cycles-pp.rw_verify_area
0.69 ± 3% -0.1 0.60 perf-profile.children.cycles-pp.rcu_all_qs
0.90 -0.1 0.82 ± 2% perf-profile.children.cycles-pp.up_write
0.23 ± 5% -0.1 0.16 ± 2% perf-profile.children.cycles-pp.xas_start
0.85 -0.1 0.80 perf-profile.children.cycles-pp.noop_dirty_folio
0.23 ± 4% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.x64_sys_call
0.15 ± 5% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.security_file_permission
0.28 ± 2% -0.0 0.26 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.17 ± 5% +0.0 0.19 ± 3% perf-profile.children.cycles-pp.sched_tick
1.18 +0.1 1.28 ± 2% perf-profile.children.cycles-pp.down_write
0.35 ± 3% +0.1 0.48 ± 6% perf-profile.children.cycles-pp.folio_mapping
0.50 ± 2% +0.2 0.69 perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
0.55 ± 2% +0.2 0.75 perf-profile.children.cycles-pp.folio_mark_accessed
1.75 ± 2% +0.4 2.10 ± 2% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.90 +0.5 1.36 ± 5% perf-profile.children.cycles-pp.folio_mark_dirty
2.17 +1.2 3.41 perf-profile.children.cycles-pp.fault_in_readable
2.40 +1.4 3.75 perf-profile.children.cycles-pp.fault_in_iov_iter_readable
46.10 +2.8 48.93 perf-profile.children.cycles-pp.__x64_sys_pwrite64
43.86 +3.2 47.10 perf-profile.children.cycles-pp.vfs_write
39.00 +3.4 42.41 perf-profile.children.cycles-pp.shmem_file_write_iter
59.15 +3.4 62.56 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
49.50 +3.7 53.21 perf-profile.children.cycles-pp.do_syscall_64
29.56 +5.6 35.14 perf-profile.children.cycles-pp.generic_perform_write
4.74 +8.3 13.02 perf-profile.children.cycles-pp.rep_movs_alternative
6.85 +9.6 16.44 perf-profile.children.cycles-pp.copy_page_from_iter_atomic
4.34 ± 2% -2.9 1.43 ± 2% perf-profile.self.cycles-pp.filemap_get_entry
14.06 -1.4 12.65 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
16.74 -1.2 15.53 perf-profile.self.cycles-pp.syscall_return_via_sysret
1.39 ± 10% -1.2 0.21 ± 8% perf-profile.self.cycles-pp.xas_load
1.49 ± 3% -0.9 0.58 perf-profile.self.cycles-pp.folio_unlock
2.72 ± 2% -0.9 1.83 perf-profile.self.cycles-pp.__libc_pwrite
1.42 ± 13% -0.8 0.61 ± 3% perf-profile.self.cycles-pp.file_remove_privs_flags
1.42 -0.6 0.83 perf-profile.self.cycles-pp.inode_needs_update_time
1.92 ± 5% -0.5 1.44 perf-profile.self.cycles-pp.shmem_get_folio_gfp
6.24 -0.4 5.81 perf-profile.self.cycles-pp.entry_SYSCALL_64
9.82 -0.3 9.50 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.64 ± 3% -0.3 0.38 perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
1.06 ± 2% -0.3 0.79 perf-profile.self.cycles-pp.__cond_resched
1.74 ± 5% -0.2 1.52 ± 2% perf-profile.self.cycles-pp.shmem_write_begin
1.24 ± 2% -0.2 1.03 perf-profile.self.cycles-pp.fdget
0.45 ± 3% -0.2 0.25 perf-profile.self.cycles-pp.file_update_time
0.98 ± 2% -0.2 0.79 ± 2% perf-profile.self.cycles-pp.__x64_sys_pwrite64
2.73 ± 2% -0.2 2.54 ± 2% perf-profile.self.cycles-pp.shmem_write_end
0.72 ± 5% -0.1 0.58 ± 4% perf-profile.self.cycles-pp.generic_write_checks
1.14 -0.1 1.02 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.36 ± 3% -0.1 0.25 ± 2% perf-profile.self.cycles-pp.timestamp_truncate
0.23 ± 4% -0.1 0.15 ± 2% perf-profile.self.cycles-pp.rw_verify_area
0.60 ± 3% -0.1 0.53 perf-profile.self.cycles-pp.rcu_all_qs
0.81 -0.1 0.74 perf-profile.self.cycles-pp.noop_dirty_folio
0.20 ± 4% -0.1 0.14 ± 2% perf-profile.self.cycles-pp.xas_start
0.81 -0.1 0.75 ± 2% perf-profile.self.cycles-pp.up_write
0.21 ± 3% -0.0 0.18 ± 3% perf-profile.self.cycles-pp.x64_sys_call
0.26 ± 2% -0.0 0.23 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.12 ± 6% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.security_file_permission
0.21 ± 4% +0.0 0.24 perf-profile.self.cycles-pp.testcase
0.77 ± 2% +0.0 0.82 ± 3% perf-profile.self.cycles-pp.down_write
0.24 ± 3% +0.1 0.36 perf-profile.self.cycles-pp.fault_in_iov_iter_readable
0.30 ± 3% +0.1 0.43 ± 6% perf-profile.self.cycles-pp.folio_mapping
0.35 ± 2% +0.2 0.54 perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
2.74 +0.2 2.93 ± 2% perf-profile.self.cycles-pp.generic_perform_write
0.52 +0.2 0.72 perf-profile.self.cycles-pp.folio_mark_accessed
0.55 ± 2% +0.3 0.87 ± 5% perf-profile.self.cycles-pp.folio_mark_dirty
0.56 +0.5 1.10 ± 4% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.48 ± 2% +1.1 2.55 ± 4% perf-profile.self.cycles-pp.do_syscall_64
2.14 +1.2 3.35 perf-profile.self.cycles-pp.fault_in_readable
2.20 +1.3 3.51 ± 2% perf-profile.self.cycles-pp.copy_page_from_iter_atomic
4.59 +8.2 12.80 perf-profile.self.cycles-pp.rep_movs_alternative
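The `%change` column in the table above is the relative difference between the two means. A small sketch for checking the headline figures by hand (the helper name `pct_change` is hypothetical and not part of lkp-tests):

```c
#include <stdio.h>

/* Relative change from the base value to the head value, in percent. */
static double pct_change(double base, double head)
{
    return (head - base) / base * 100.0;
}

/* Print one metric in a "base -> head (change)" form. */
static void report(const char *metric, double base, double head)
{
    printf("%-32s %12.2f -> %12.2f (%+.1f%%)\n",
           metric, base, head, pct_change(base, head));
}
```

Calling `report("will-it-scale.per_process_ops", 553365, 526872)` gives a change that rounds to -4.8%, and `report("perf-stat.overall.cpi", 3.29, 3.45)` rounds to +4.9%, matching the table. The perf-profile rows carry no `%` sign in the change column and appear to be plain differences in percentage points (for example 10.62 to 5.81 is listed as -4.8).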
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linus:master] [timekeeping] ee3283c608: will-it-scale.per_process_ops 4.8% regression
From: Jeff Layton @ 2025-01-26 12:23 UTC
To: kernel test robot
Cc: oe-lkp, lkp, linux-kernel, Christian Brauner, Thomas Gleixner,
John Stultz
On Sun, 2025-01-26 at 16:25 +0800, kernel test robot wrote:
> Hi Jeff Layton,
>
>
> We are sending the report below just FYI, since the results are stable in our tests.
> We don't know enough to tell whether this regression is due to the alignment of:
>
> +static __cacheline_aligned_in_smp atomic64_t mg_floor;
>
> If this is of low value, please just ignore it. Thanks a lot.
>
I think this is more or less the same regression we measured with the
pipe1 test during the rc phase:
https://lore.kernel.org/linux-fsdevel/202410091041.6f5d221e-oliver.sang@intel.com/
This test is just testing how fast it can do writes into a file in /tmp
without doing anything else in between. I don't think there is much we
can do to mitigate the perf hit here, as there is a basic cost to
fetching and handling the floor and ctime consistently.
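To illustrate what "handling the floor" can involve, here is a simplified userspace sketch using C11 atomics. It is not the kernel implementation: the names `floor_ns`, `coarse_with_floor` and `fine_update_floor` are hypothetical, and timestamps are reduced to plain nanosecond counts passed in by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global floor, in nanoseconds. */
static _Atomic int64_t floor_ns;

/* Cheap path: never report a time older than the current floor. */
static int64_t coarse_with_floor(int64_t coarse_ns)
{
    int64_t floor = atomic_load(&floor_ns);

    return coarse_ns > floor ? coarse_ns : floor;
}

/* Expensive path: take a fine-grained time and try to advance the floor. */
static int64_t fine_update_floor(int64_t fine_ns)
{
    int64_t old = atomic_load(&floor_ns);

    while (old < fine_ns) {
        /* On failure, old is refreshed with the floor another caller set. */
        if (atomic_compare_exchange_weak(&floor_ns, &old, fine_ns))
            return fine_ns;
    }
    /* The floor is already at or past fine_ns, so report the floor. */
    return old;
}
```

In this sketch every timestamp fetch reads the same shared atomic, and fine-grained fetches may also write to it. That gives a feel for why some per-call cost is hard to avoid once timestamps have to stay consistent with a global floor.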
>
> Hello,
>
> kernel test robot noticed a 4.8% regression of will-it-scale.per_process_ops on:
>
>
> commit: ee3283c608dfa21251b0821d7bb198c7ae3189f6 ("timekeeping: Add interfaces for handling timestamps with a floor value")
That patch just adds two new interfaces, but the first caller of them
wasn't added until a later patch. Are you sure that bisect landed in
the right place?
--
Jeff Layton <jlayton@kernel.org>