From: kernel test robot <oliver.sang@intel.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>,
	Christian Brauner <brauner@kernel.org>,
	Josef Bacik <josef@toxicpanda.com>,
	Christoph Hellwig <hch@lst.de>, Jan Kara <jack@suse.cz>,
	<linux-fsdevel@vger.kernel.org>, <ying.huang@intel.com>,
	<feng.tang@intel.com>, <fengwei.yin@intel.com>,
	<oliver.sang@intel.com>
Subject: [linus:master] [remap_range]  dfad37051a: stress-ng.file-ioctl.ops_per_sec -11.2% regression
Date: Wed, 31 Jan 2024 22:13:16 +0800
Message-ID: <202401312229.eddeb9a6-oliver.sang@intel.com>



Hello,

kernel test robot noticed a -11.2% regression of stress-ng.file-ioctl.ops_per_sec on:


commit: dfad37051ade6ac0d404ef4913f3bd01954ee51c ("remap_range: move permission hooks out of do_clone_file_range()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
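
For readers unfamiliar with the code path involved: the profile in this report
attributes the added cycles to ioctl_file_clone -> vfs_clone_file_range, which
is the path taken by the FICLONE ioctl. The following is a minimal sketch of
issuing that ioctl from user space, not the stress-ng stressor itself. The
helper name try_ficlone and the file names are hypothetical; the request value
is FICLONE as defined in <linux/fs.h>, i.e. _IOW(0x94, 9, int).

```python
import fcntl
import os
import tempfile

# FICLONE = _IOW(0x94, 9, int) from <linux/fs.h>
FICLONE = 0x40049409


def try_ficlone(src_path, dst_path):
    """Try to reflink src_path into dst_path (hypothetical helper).

    Returns True if the clone succeeded, False if the kernel or the
    filesystem refused it (e.g. no reflink support).
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # The ioctl is issued on the destination fd; the argument
            # is the source fd.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        dst = os.path.join(tmp, "dst")
        with open(src, "wb") as f:
            f.write(b"x" * 4096)
        print("clone succeeded:", try_ficlone(src, dst))
```

Whether the call succeeds depends on the filesystem; btrfs (used in this test)
supports reflinks, while many others return an error.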

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 10%
	disk: 1HDD
	testtime: 60s
	fs: btrfs
	test: file-ioctl
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202401312229.eddeb9a6-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240131/202401312229.eddeb9a6-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/1HDD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp8/file-ioctl/stress-ng/60s

commit: 
  d53471ba6f ("splice: remove permission hook from iter_file_splice_write()")
  dfad37051a ("remap_range: move permission hooks out of do_clone_file_range()")

d53471ba6f7ae97a dfad37051ade6ac0d404ef4913f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      2.57            -0.3        2.27        mpstat.cpu.all.usr%
      7.40            +3.4%       7.65        iostat.cpu.system
      2.50           -11.5%       2.22        iostat.cpu.user
  95739218           -11.2%   84990543 ±  2%  stress-ng.file-ioctl.ops
   1595650           -11.2%    1416506 ±  2%  stress-ng.file-ioctl.ops_per_sec
    267.41            +4.2%     278.66        stress-ng.time.system_time
     90.19           -12.5%      78.96        stress-ng.time.user_time
      0.12 ±  9%     +37.6%       0.16 ±  3%  perf-stat.i.MPKI
 5.619e+09            -4.9%  5.346e+09        perf-stat.i.branch-instructions
     25.26 ± 12%      +5.4       30.67 ±  2%  perf-stat.i.cache-miss-rate%
   3226271 ±  8%     +32.3%    4268159 ±  2%  perf-stat.i.cache-misses
  13880671 ±  2%      +7.6%   14934433        perf-stat.i.cache-references
      0.83            +3.9%       0.86        perf-stat.i.cpi
      7405 ±  8%     -26.1%       5473 ±  2%  perf-stat.i.cycles-between-cache-misses
 5.186e+09            -6.0%  4.873e+09        perf-stat.i.dTLB-stores
 2.807e+10            -3.9%  2.696e+10        perf-stat.i.instructions
      1.21            -3.7%       1.17        perf-stat.i.ipc
    257.16           +12.9%     290.46        perf-stat.i.metric.K/sec
    290.80            -4.2%     278.45        perf-stat.i.metric.M/sec
   1580051 ± 11%     +38.0%    2180479 ±  5%  perf-stat.i.node-load-misses
    228848 ± 22%    +116.2%     494834 ± 27%  perf-stat.i.node-loads
      0.11 ±  9%     +37.7%       0.16 ±  3%  perf-stat.overall.MPKI
     23.29 ± 11%      +5.3       28.58 ±  2%  perf-stat.overall.cache-miss-rate%
      0.82            +3.9%       0.86        perf-stat.overall.cpi
      7231 ±  8%     -25.1%       5416 ±  2%  perf-stat.overall.cycles-between-cache-misses
      1.21            -3.7%       1.17        perf-stat.overall.ipc
 5.524e+09            -4.8%  5.257e+09        perf-stat.ps.branch-instructions
   3170718 ±  8%     +32.4%    4196610 ±  2%  perf-stat.ps.cache-misses
  13646445 ±  2%      +7.6%   14686495 ±  2%  perf-stat.ps.cache-references
 5.099e+09            -6.0%  4.792e+09        perf-stat.ps.dTLB-stores
 2.759e+10            -3.9%  2.651e+10        perf-stat.ps.instructions
   1553350 ± 11%     +38.1%    2144498 ±  5%  perf-stat.ps.node-load-misses
    224907 ± 22%    +116.2%     486304 ± 27%  perf-stat.ps.node-loads
 1.668e+12            -3.4%  1.611e+12 ±  2%  perf-stat.total.instructions
      5.57 ±  3%      -0.7        4.85 ±  2%  perf-profile.calltrace.cycles-pp.__fget_light.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      0.89 ± 23%      -0.4        0.45 ± 44%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      2.30 ±  2%      -0.3        2.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.69 ±  3%      -0.3        1.39 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      1.99 ±  2%      -0.3        1.72        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.16 ±  3%      -0.2        1.00 ±  3%  perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.60 ±  4%      -0.2        0.44 ± 45%  perf-profile.calltrace.cycles-pp.__fget_light.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +1.5        1.52 ±  2%  perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
      0.00            +6.9        6.94 ±  6%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl
      0.00            +7.4        7.41 ±  6%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
     21.11            +7.4       28.53        perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      3.18 ±  2%      +8.7       11.87 ±  3%  perf-profile.calltrace.cycles-pp.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.46 ±  9%      +8.9       10.36 ±  4%  perf-profile.calltrace.cycles-pp.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
     10.70            -1.3        9.39 ±  3%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     11.31            -1.1       10.24 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      7.87 ±  3%      -1.0        6.90        perf-profile.children.cycles-pp.__fget_light
      5.13            -0.7        4.46 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.89            -0.4        0.46 ±  5%  perf-profile.children.cycles-pp.do_clone_file_range
      3.45 ±  2%      -0.4        3.10        perf-profile.children.cycles-pp.llseek
      1.80 ±  4%      -0.3        1.49 ±  3%  perf-profile.children.cycles-pp.stress_file_ioctl
      1.83            -0.2        1.63 ±  4%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.53 ±  3%      -0.2        1.34 ±  4%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      2.32 ±  3%      -0.2        2.13        perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.58 ±  2%      -0.2        1.40        perf-profile.children.cycles-pp.memdup_user
      1.81            -0.2        1.62        perf-profile.children.cycles-pp.__get_user_4
      1.26 ±  3%      -0.2        1.08 ±  3%  perf-profile.children.cycles-pp.__x64_sys_fcntl
      1.32 ±  2%      -0.2        1.14 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      2.06 ±  2%      -0.2        1.90 ±  3%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      1.12 ±  3%      -0.1        0.99 ±  2%  perf-profile.children.cycles-pp.security_file_ioctl
      0.84 ±  3%      -0.1        0.73 ±  3%  perf-profile.children.cycles-pp.ksys_lseek
      0.29 ±  4%      -0.1        0.18 ±  4%  perf-profile.children.cycles-pp.generic_file_rw_checks
      0.76 ±  3%      -0.1        0.68        perf-profile.children.cycles-pp.amd_clear_divider
      0.84 ±  3%      -0.1        0.75 ±  3%  perf-profile.children.cycles-pp.__put_user_4
      0.86 ±  4%      -0.1        0.78 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock
      0.53 ±  3%      -0.1        0.46 ±  4%  perf-profile.children.cycles-pp.__fdget_pos
      0.19 ± 11%      -0.1        0.12 ± 10%  perf-profile.children.cycles-pp.stress_mwc8
      0.54 ±  5%      -0.1        0.48 ±  6%  perf-profile.children.cycles-pp.__check_object_size
      0.73 ±  2%      -0.1        0.67 ±  5%  perf-profile.children.cycles-pp.__fdget
      0.49 ±  2%      -0.1        0.43 ±  3%  perf-profile.children.cycles-pp.__kmalloc_node_track_caller
      0.51 ±  4%      -0.1        0.45 ±  5%  perf-profile.children.cycles-pp.ioctl@plt
      0.58 ±  3%      -0.0        0.54 ±  4%  perf-profile.children.cycles-pp.__get_user_2
      0.38 ±  3%      -0.0        0.33 ±  4%  perf-profile.children.cycles-pp.__kmem_cache_alloc_node
      0.44 ±  3%      -0.0        0.40 ±  3%  perf-profile.children.cycles-pp.__libc_fcntl64
      0.24 ±  6%      -0.0        0.20 ±  7%  perf-profile.children.cycles-pp.do_fcntl
      0.48 ±  3%      -0.0        0.44 ±  2%  perf-profile.children.cycles-pp.set_close_on_exec
      0.16 ±  8%      -0.0        0.14 ±  8%  perf-profile.children.cycles-pp.__check_heap_object
      0.00            +0.2        0.25 ±  4%  perf-profile.children.cycles-pp.fsnotify_perm
      0.57            +0.6        1.15 ±  3%  perf-profile.children.cycles-pp.aa_file_perm
     85.52            +1.4       86.91        perf-profile.children.cycles-pp.ioctl
      0.00            +1.6        1.55        perf-profile.children.cycles-pp.__fsnotify_parent
     62.60            +4.0       66.55        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     59.77            +4.3       64.05        perf-profile.children.cycles-pp.do_syscall_64
     47.98            +5.7       53.66        perf-profile.children.cycles-pp.__x64_sys_ioctl
     21.64            +7.3       28.98        perf-profile.children.cycles-pp.do_vfs_ioctl
      8.29 ±  4%      +7.4       15.74 ±  6%  perf-profile.children.cycles-pp.apparmor_file_permission
      8.78 ±  4%      +7.9       16.64 ±  5%  perf-profile.children.cycles-pp.security_file_permission
      3.30 ±  2%      +8.7       11.96 ±  3%  perf-profile.children.cycles-pp.ioctl_file_clone
      1.68            +8.9       10.55 ±  3%  perf-profile.children.cycles-pp.vfs_clone_file_range
     10.33            -1.3        9.02 ±  3%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
     11.15            -1.2        9.92 ±  2%  perf-profile.self.cycles-pp.ioctl
      7.55 ±  3%      -0.9        6.61        perf-profile.self.cycles-pp.__fget_light
      3.16 ±  4%      -0.5        2.69 ±  2%  perf-profile.self.cycles-pp.do_vfs_ioctl
      2.95 ±  2%      -0.4        2.55 ±  2%  perf-profile.self.cycles-pp.__x64_sys_ioctl
      3.32            -0.4        2.93 ±  2%  perf-profile.self.cycles-pp.do_syscall_64
      3.08 ±  2%      -0.4        2.72 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      3.13            -0.4        2.78 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      2.39 ±  2%      -0.3        2.10 ±  2%  perf-profile.self.cycles-pp.ioctl_preallocate
      0.57 ±  2%      -0.3        0.31 ±  9%  perf-profile.self.cycles-pp.do_clone_file_range
      2.02 ±  2%      -0.3        1.77 ±  3%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.54 ±  4%      -0.2        1.29 ±  3%  perf-profile.self.cycles-pp.stress_file_ioctl
      1.83            -0.2        1.62 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      2.32 ±  3%      -0.2        2.13        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.77            -0.2        1.58        perf-profile.self.cycles-pp.__get_user_4
      1.28 ±  2%      -0.2        1.11 ±  4%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      1.76 ±  2%      -0.1        1.62 ±  3%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.25 ±  6%      -0.1        0.12 ±  8%  perf-profile.self.cycles-pp.generic_file_rw_checks
      0.48 ±  2%      -0.1        0.38 ±  4%  perf-profile.self.cycles-pp.ioctl_file_clone
      0.79 ±  3%      -0.1        0.70 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.81 ±  3%      -0.1        0.73 ±  4%  perf-profile.self.cycles-pp.__put_user_4
      0.81 ±  5%      -0.1        0.73 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
      0.52 ±  4%      -0.1        0.44 ±  3%  perf-profile.self.cycles-pp.amd_clear_divider
      0.17 ± 11%      -0.1        0.12 ± 10%  perf-profile.self.cycles-pp.stress_mwc8
      0.57 ±  3%      -0.0        0.52 ±  4%  perf-profile.self.cycles-pp.__get_user_2
      0.42 ±  4%      -0.0        0.38 ±  3%  perf-profile.self.cycles-pp.__libc_fcntl64
      0.30 ±  3%      -0.0        0.26 ±  5%  perf-profile.self.cycles-pp.__x64_sys_fcntl
      0.22 ±  5%      -0.0        0.18 ±  6%  perf-profile.self.cycles-pp.do_fcntl
      0.28 ±  3%      -0.0        0.24 ±  2%  perf-profile.self.cycles-pp.__kmem_cache_alloc_node
      0.00            +0.2        0.22 ±  4%  perf-profile.self.cycles-pp.fsnotify_perm
      0.49 ±  3%      +0.4        0.92 ±  2%  perf-profile.self.cycles-pp.security_file_permission
      0.46 ±  2%      +0.5        0.96 ±  2%  perf-profile.self.cycles-pp.aa_file_perm
      0.00            +1.5        1.52 ±  2%  perf-profile.self.cycles-pp.__fsnotify_parent
      7.75 ±  4%      +6.8       14.58 ±  7%  perf-profile.self.cycles-pp.apparmor_file_permission
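
As a reading aid for the table above, here is a small sketch that recomputes a
few entries from the printed values. It assumes that rows with a "%" in the
%change column are relative changes, that rows without it (percentage-type
metrics such as the perf-profile rows) are absolute differences in percentage
points, and that MPKI means cache misses per thousand instructions. The
function names are hypothetical, and because the report presumably works from
unrounded per-run means, recomputed values may differ in the last digit.

```python
def pct_change(base, new):
    """Relative change in percent, as in rows such as '-11.2%'."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference in percentage points, as in rows such as '+8.9'."""
    return new - base


def mpki(cache_misses, instructions):
    """Cache misses per thousand instructions."""
    return cache_misses / instructions * 1000.0


# Values copied from the table: (parent d53471ba6f, commit dfad37051a).
ops_per_sec = (1595650, 1416506)          # stress-ng.file-ioctl.ops_per_sec
clone_children = (1.68, 10.55)            # children.cycles-pp.vfs_clone_file_range
ps_cache_misses = (3170718, 4196610)      # perf-stat.ps.cache-misses
ps_instructions = (2.759e10, 2.651e10)    # perf-stat.ps.instructions

print(f"ops_per_sec change: {pct_change(*ops_per_sec):+.1f}%")
print(f"vfs_clone_file_range: {point_delta(*clone_children):+.1f} points")
for misses, instr in zip(ps_cache_misses, ps_instructions):
    print(f"MPKI: {mpki(misses, instr):.2f}")
```

With the values above this yields -11.2% for ops_per_sec, +8.9 points for
vfs_clone_file_range, and MPKI of 0.11 and 0.16, in line with the
perf-stat.overall.MPKI row.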




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Thread overview: 6+ messages
2024-01-31 14:13 kernel test robot [this message]
2024-01-31 15:47 ` [linus:master] [remap_range] dfad37051a: stress-ng.file-ioctl.ops_per_sec -11.2% regression Amir Goldstein
2024-02-02  9:13   ` Amir Goldstein
2024-02-04  6:32     ` Oliver Sang
2024-02-06 15:04       ` Amir Goldstein
2024-02-06 16:08         ` Christian Brauner
