From: kernel test robot <oliver.sang@intel.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>,
Christian Brauner <brauner@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
John Stultz <jstultz@google.com>, <oliver.sang@intel.com>
Subject: [linus:master] [timekeeping] ee3283c608: will-it-scale.per_process_ops 4.8% regression
Date: Sun, 26 Jan 2025 16:25:58 +0800
Message-ID: <202501261527.c3bf4764-lkp@intel.com>
Hi Jeff Layton,
we are sending the report below just FYI, since the results are stable in our tests.
We don't know enough to tell whether this regression is due to the alignment in:
+static __cacheline_aligned_in_smp atomic64_t mg_floor;
If this is of low value, please just ignore it. Thanks a lot.
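As background for the alignment question above: `__cacheline_aligned_in_smp` places a variable at the start of its own cache line on SMP builds, which can shift where neighbouring data lands. The sketch below is only an illustration, assuming a 64-byte line and made-up addresses. The helper names are hypothetical, and nothing here is taken from the kernel sources or from this report's data.

```python
CACHE_LINE = 64  # assumed cache line size in bytes (typical for x86_64)


def align_up(addr, line_size=CACHE_LINE):
    """Round addr up to the next multiple of line_size."""
    return (addr + line_size - 1) // line_size * line_size


def lines_touched(addr, size, line_size=CACHE_LINE):
    """Return the set of cache line indices covered by [addr, addr + size)."""
    first = addr // line_size
    last = (addr + size - 1) // line_size
    return set(range(first, last + 1))


def shares_line(addr_a, size_a, addr_b, size_b, line_size=CACHE_LINE):
    """True if the two objects occupy at least one common cache line."""
    return bool(lines_touched(addr_a, size_a, line_size)
                & lines_touched(addr_b, size_b, line_size))


if __name__ == "__main__":
    # Hypothetical layout: two 8-byte variables placed back to back.
    neighbour = 0x1030
    floor_unaligned = 0x1038
    print(shares_line(neighbour, 8, floor_unaligned, 8))

    # Same layout after forcing the second variable onto its own line.
    floor_aligned = align_up(floor_unaligned)
    print(hex(floor_aligned), shares_line(neighbour, 8, floor_aligned, 8))
```

Whether such a layout shift explains the numbers below would need to be checked against the actual symbol addresses of the two kernels, for example from their System.map files.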
Hello,
kernel test robot noticed a 4.8% regression of will-it-scale.per_process_ops on:
commit: ee3283c608dfa21251b0821d7bb198c7ae3189f6 ("timekeeping: Add interfaces for handling timestamps with a floor value")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master bc8198dc7ebc492ec3e9fa1617dcdfbe98e73b17]
[test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:
nr_task: 100%
mode: process
test: pwrite1
cpufreq_governor: performance
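Judging by the test name and the `__libc_pwrite` and `shmem_file_write_iter` frames in the profile further down, each process appears to spend its time in a tight loop of `pwrite()` calls at offset 0 of a temporary file. The following is a rough Python sketch of that kind of loop, not the actual will-it-scale source (which is written in C). The buffer size, the iteration count, and the function name are assumptions.

```python
import os
import tempfile

BUF_LEN = 4096  # assumed write size per call


def pwrite_loop(iterations, buf_len=BUF_LEN, directory=None):
    """Write buf_len zero bytes at offset 0, `iterations` times.

    Returns the number of completed writes. The file is unlinked right
    after creation so that nothing is left behind on exit.
    """
    buf = bytes(buf_len)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.unlink(path)
        done = 0
        for _ in range(iterations):
            written = os.pwrite(fd, buf, 0)
            if written != buf_len:
                raise OSError("short write")
            done += 1
        return done
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Point `directory` at a tmpfs mount (for example /dev/shm) to
    # exercise the shmem write path seen in the profile.
    print(pwrite_loop(100000))
```

Every iteration rewrites the same page, so the per-call cost is dominated by syscall entry and exit, the page cache lookup, the data copy, and the timestamp update in `file_update_time`.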
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202501261527.c3bf4764-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250126/202501261527.c3bf4764-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/pwrite1/will-it-scale
commit:
v6.12-rc2
ee3283c608 ("timekeeping: Add interfaces for handling timestamps with a floor value")
v6.12-rc2 ee3283c608dfa21251b0821d7bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
57550068 -4.8% 54794800 will-it-scale.104.processes
553365 -4.8% 526872 will-it-scale.per_process_ops
57550068 -4.8% 54794800 will-it-scale.workload
43.00 ± 27% -60.0% 17.20 ± 27% perf-c2c.DRAM.local
251.20 ± 23% -57.5% 106.80 ± 16% perf-c2c.DRAM.remote
520.00 ± 33% -70.3% 154.20 ± 13% perf-c2c.HITM.local
218.50 ± 25% -55.2% 97.80 ± 18% perf-c2c.HITM.remote
0.03 ± 14% +48.4% 0.04 ± 9% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
4.18 ± 4% +21.5% 5.08 perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
653.70 ± 5% +50.5% 983.70 ± 7% perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
913.40 ± 6% -24.8% 686.80 ± 7% perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
1.29 ± 81% +42618.3% 552.09 ± 74% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
2.58 ± 81% +65403.1% 1692 ± 72% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
1.721e+10 -4.8% 1.639e+10 perf-stat.i.branch-instructions
1.66 +0.1 1.72 perf-stat.i.branch-miss-rate%
2.852e+08 -1.2% 2.818e+08 perf-stat.i.branch-misses
3.29 +4.9% 3.45 perf-stat.i.cpi
8.743e+10 -4.8% 8.327e+10 perf-stat.i.instructions
0.30 -4.7% 0.29 perf-stat.i.ipc
1.66 +0.1 1.72 perf-stat.overall.branch-miss-rate%
3.29 +4.9% 3.45 perf-stat.overall.cpi
0.30 -4.7% 0.29 perf-stat.overall.ipc
1.715e+10 -4.8% 1.634e+10 perf-stat.ps.branch-instructions
2.842e+08 -1.2% 2.809e+08 perf-stat.ps.branch-misses
8.714e+10 -4.8% 8.3e+10 perf-stat.ps.instructions
2.632e+13 -4.7% 2.508e+13 perf-stat.total.instructions
10.62 -4.8 5.81 perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
8.89 ± 2% -4.6 4.25 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write
5.98 ± 3% -4.2 1.79 ± 2% perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
13.24 -1.4 11.88 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__libc_pwrite
16.62 -1.2 15.42 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_pwrite
2.90 -1.2 1.74 perf-profile.calltrace.cycles-pp.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
2.38 ± 2% -0.9 1.44 perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
1.68 ± 2% -0.9 0.79 perf-profile.calltrace.cycles-pp.folio_unlock.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
1.42 ± 13% -0.8 0.64 ± 3% perf-profile.calltrace.cycles-pp.file_remove_privs_flags.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
5.69 -0.7 4.99 ± 2% perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
6.91 -0.4 6.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__libc_pwrite
1.23 ± 2% -0.2 1.01 perf-profile.calltrace.cycles-pp.fdget.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
1.41 -0.2 1.26 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
0.87 -0.1 0.79 ± 2% perf-profile.calltrace.cycles-pp.up_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
0.79 ± 2% -0.1 0.74 perf-profile.calltrace.cycles-pp.noop_dirty_folio.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
1.15 ± 2% +0.1 1.26 ± 2% perf-profile.calltrace.cycles-pp.down_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
0.54 +0.2 0.73 perf-profile.calltrace.cycles-pp.folio_mark_accessed.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
0.82 ± 2% +0.4 1.26 ± 5% perf-profile.calltrace.cycles-pp.folio_mark_dirty.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
0.00 +0.7 0.67 perf-profile.calltrace.cycles-pp.balance_dirty_pages_ratelimited_flags.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
2.10 +1.2 3.35 perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write
2.36 +1.3 3.69 perf-profile.calltrace.cycles-pp.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
46.08 +2.8 48.91 perf-profile.calltrace.cycles-pp.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
43.76 +3.3 47.02 perf-profile.calltrace.cycles-pp.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
58.89 +3.4 62.32 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pwrite
38.55 +3.5 42.07 perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe
49.37 +3.7 53.09 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
29.41 +5.6 34.99 perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
4.60 +7.7 12.30 perf-profile.calltrace.cycles-pp.rep_movs_alternative.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write
6.68 +10.3 16.96 perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
10.69 -4.8 5.86 perf-profile.children.cycles-pp.shmem_write_begin
8.99 ± 2% -4.6 4.35 perf-profile.children.cycles-pp.shmem_get_folio_gfp
6.00 ± 3% -4.2 1.81 ± 2% perf-profile.children.cycles-pp.filemap_get_entry
14.20 -1.4 12.77 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
1.62 ± 9% -1.3 0.37 ± 5% perf-profile.children.cycles-pp.xas_load
16.76 -1.2 15.54 perf-profile.children.cycles-pp.syscall_return_via_sysret
2.96 -1.2 1.79 perf-profile.children.cycles-pp.file_update_time
2.47 ± 2% -1.0 1.51 perf-profile.children.cycles-pp.inode_needs_update_time
1.69 ± 2% -0.9 0.79 perf-profile.children.cycles-pp.folio_unlock
1.44 ± 13% -0.8 0.65 ± 3% perf-profile.children.cycles-pp.file_remove_privs_flags
5.94 -0.7 5.24 ± 2% perf-profile.children.cycles-pp.shmem_write_end
7.17 -0.5 6.67 perf-profile.children.cycles-pp.entry_SYSCALL_64
1.77 -0.4 1.42 perf-profile.children.cycles-pp.__cond_resched
0.67 ± 3% -0.3 0.41 perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
1.68 ± 9% -0.2 1.42 ± 4% perf-profile.children.cycles-pp.generic_write_checks
1.25 -0.2 1.03 perf-profile.children.cycles-pp.fdget
1.44 -0.2 1.28 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.38 ± 3% -0.1 0.27 ± 2% perf-profile.children.cycles-pp.timestamp_truncate
0.37 ± 4% -0.1 0.26 perf-profile.children.cycles-pp.rw_verify_area
0.69 ± 3% -0.1 0.60 perf-profile.children.cycles-pp.rcu_all_qs
0.90 -0.1 0.82 ± 2% perf-profile.children.cycles-pp.up_write
0.23 ± 5% -0.1 0.16 ± 2% perf-profile.children.cycles-pp.xas_start
0.85 -0.1 0.80 perf-profile.children.cycles-pp.noop_dirty_folio
0.23 ± 4% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.x64_sys_call
0.15 ± 5% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.security_file_permission
0.28 ± 2% -0.0 0.26 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.17 ± 5% +0.0 0.19 ± 3% perf-profile.children.cycles-pp.sched_tick
1.18 +0.1 1.28 ± 2% perf-profile.children.cycles-pp.down_write
0.35 ± 3% +0.1 0.48 ± 6% perf-profile.children.cycles-pp.folio_mapping
0.50 ± 2% +0.2 0.69 perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
0.55 ± 2% +0.2 0.75 perf-profile.children.cycles-pp.folio_mark_accessed
1.75 ± 2% +0.4 2.10 ± 2% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.90 +0.5 1.36 ± 5% perf-profile.children.cycles-pp.folio_mark_dirty
2.17 +1.2 3.41 perf-profile.children.cycles-pp.fault_in_readable
2.40 +1.4 3.75 perf-profile.children.cycles-pp.fault_in_iov_iter_readable
46.10 +2.8 48.93 perf-profile.children.cycles-pp.__x64_sys_pwrite64
43.86 +3.2 47.10 perf-profile.children.cycles-pp.vfs_write
39.00 +3.4 42.41 perf-profile.children.cycles-pp.shmem_file_write_iter
59.15 +3.4 62.56 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
49.50 +3.7 53.21 perf-profile.children.cycles-pp.do_syscall_64
29.56 +5.6 35.14 perf-profile.children.cycles-pp.generic_perform_write
4.74 +8.3 13.02 perf-profile.children.cycles-pp.rep_movs_alternative
6.85 +9.6 16.44 perf-profile.children.cycles-pp.copy_page_from_iter_atomic
4.34 ± 2% -2.9 1.43 ± 2% perf-profile.self.cycles-pp.filemap_get_entry
14.06 -1.4 12.65 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
16.74 -1.2 15.53 perf-profile.self.cycles-pp.syscall_return_via_sysret
1.39 ± 10% -1.2 0.21 ± 8% perf-profile.self.cycles-pp.xas_load
1.49 ± 3% -0.9 0.58 perf-profile.self.cycles-pp.folio_unlock
2.72 ± 2% -0.9 1.83 perf-profile.self.cycles-pp.__libc_pwrite
1.42 ± 13% -0.8 0.61 ± 3% perf-profile.self.cycles-pp.file_remove_privs_flags
1.42 -0.6 0.83 perf-profile.self.cycles-pp.inode_needs_update_time
1.92 ± 5% -0.5 1.44 perf-profile.self.cycles-pp.shmem_get_folio_gfp
6.24 -0.4 5.81 perf-profile.self.cycles-pp.entry_SYSCALL_64
9.82 -0.3 9.50 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.64 ± 3% -0.3 0.38 perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
1.06 ± 2% -0.3 0.79 perf-profile.self.cycles-pp.__cond_resched
1.74 ± 5% -0.2 1.52 ± 2% perf-profile.self.cycles-pp.shmem_write_begin
1.24 ± 2% -0.2 1.03 perf-profile.self.cycles-pp.fdget
0.45 ± 3% -0.2 0.25 perf-profile.self.cycles-pp.file_update_time
0.98 ± 2% -0.2 0.79 ± 2% perf-profile.self.cycles-pp.__x64_sys_pwrite64
2.73 ± 2% -0.2 2.54 ± 2% perf-profile.self.cycles-pp.shmem_write_end
0.72 ± 5% -0.1 0.58 ± 4% perf-profile.self.cycles-pp.generic_write_checks
1.14 -0.1 1.02 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.36 ± 3% -0.1 0.25 ± 2% perf-profile.self.cycles-pp.timestamp_truncate
0.23 ± 4% -0.1 0.15 ± 2% perf-profile.self.cycles-pp.rw_verify_area
0.60 ± 3% -0.1 0.53 perf-profile.self.cycles-pp.rcu_all_qs
0.81 -0.1 0.74 perf-profile.self.cycles-pp.noop_dirty_folio
0.20 ± 4% -0.1 0.14 ± 2% perf-profile.self.cycles-pp.xas_start
0.81 -0.1 0.75 ± 2% perf-profile.self.cycles-pp.up_write
0.21 ± 3% -0.0 0.18 ± 3% perf-profile.self.cycles-pp.x64_sys_call
0.26 ± 2% -0.0 0.23 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.12 ± 6% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.security_file_permission
0.21 ± 4% +0.0 0.24 perf-profile.self.cycles-pp.testcase
0.77 ± 2% +0.0 0.82 ± 3% perf-profile.self.cycles-pp.down_write
0.24 ± 3% +0.1 0.36 perf-profile.self.cycles-pp.fault_in_iov_iter_readable
0.30 ± 3% +0.1 0.43 ± 6% perf-profile.self.cycles-pp.folio_mapping
0.35 ± 2% +0.2 0.54 perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
2.74 +0.2 2.93 ± 2% perf-profile.self.cycles-pp.generic_perform_write
0.52 +0.2 0.72 perf-profile.self.cycles-pp.folio_mark_accessed
0.55 ± 2% +0.3 0.87 ± 5% perf-profile.self.cycles-pp.folio_mark_dirty
0.56 +0.5 1.10 ± 4% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.48 ± 2% +1.1 2.55 ± 4% perf-profile.self.cycles-pp.do_syscall_64
2.14 +1.2 3.35 perf-profile.self.cycles-pp.fault_in_readable
2.20 +1.3 3.51 ± 2% perf-profile.self.cycles-pp.copy_page_from_iter_atomic
4.59 +8.2 12.80 perf-profile.self.cycles-pp.rep_movs_alternative
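The %change column in the table above appears to mix two conventions: rows with a `%` sign (throughput, perf-stat counters) give a relative change, while rows without one (perf-profile, branch-miss-rate%) give an absolute difference in percentage points. A small sketch that recomputes a few rows using only numbers from the table; the function names are illustrative.

```python
def percent_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference between two percentages, in points."""
    return new - base


if __name__ == "__main__":
    # will-it-scale.workload: 57550068 -> 54794800
    print(f"{percent_change(57550068, 54794800):+.1f}%")   # -4.8%
    # will-it-scale.per_process_ops: 553365 -> 526872
    print(f"{percent_change(553365, 526872):+.1f}%")       # -4.8%
    # perf-stat.overall.cpi: 3.29 -> 3.45
    print(f"{percent_change(3.29, 3.45):+.1f}%")           # +4.9%
    # IPC is the reciprocal of CPI
    print(f"{1 / 3.29:.2f} {1 / 3.45:.2f}")                # 0.30 0.29
    # perf-profile.children.cycles-pp.shmem_write_begin: 10.69 -> 5.86
    print(f"{point_delta(10.69, 5.86):+.1f}")              # -4.8
```

Read this way, the profile rows say that the share of cycles in `rep_movs_alternative` rose by about 8 points while the share in `filemap_get_entry` fell by about 4 points. These are shifts in where time is spent, not direct measurements of slowdown.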
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki