Subject: [linus:master] [netfs] d6a77668a7: filebench.sum_operations/s 158.3% improvement
From: kernel test robot @ 2024-11-26  8:44 UTC
  To: David Howells
  Cc: oe-lkp, lkp, linux-kernel, Christian Brauner, Steve French,
	Paulo Alcantara, Trond Myklebust, Jeff Layton, netfs,
	linux-fsdevel, oliver.sang



Hello,

kernel test robot noticed a 158.3% improvement in filebench.sum_operations/s on:


commit: d6a77668a708f0b5ca6713b39c178c9d9563c35b ("netfs: Downgrade i_rwsem for a buffered write")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: filebench
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:

	disk: 1HDD
	fs: xfs
	fs2: cifs
	test: randomrw.f
	cpufreq_governor: performance



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241126/202411261616.c29946d8-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-12/performance/1HDD/cifs/xfs/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/randomrw.f/filebench

commit: 
  6ed469df0b ("nilfs2: fix kernel bug due to missing clearing of buffer delay flag")
  d6a77668a7 ("netfs: Downgrade i_rwsem for a buffered write")

6ed469df0bfbef3e d6a77668a708f0b5ca6713b39c1 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  10356023 ± 13%     -88.4%    1203898 ±  8%  cpuidle..usage
      1862 ± 17%     -45.6%       1013 ± 23%  perf-c2c.HITM.local
    564994 ±  9%     -86.4%      76928 ± 36%  numa-meminfo.node1.Active(anon)
    585171 ±  7%     -84.9%      88374 ± 38%  numa-meminfo.node1.Shmem
    124475 ± 13%     -92.9%       8821 ± 14%  vmstat.system.cs
      9926 ±  6%     -39.6%       5995 ±  4%  vmstat.system.in
    576365 ± 10%     -83.0%      98054 ± 27%  meminfo.Active(anon)
   1481440 ±  4%     -33.1%     991806 ±  2%  meminfo.Committed_AS
    613566 ± 10%     -79.8%     124007 ± 22%  meminfo.Shmem
      0.02 ±  3%      -0.0        0.02 ±  4%  mpstat.cpu.all.irq%
      0.60 ±  2%      +0.1        0.69        mpstat.cpu.all.sys%
      0.18            +0.0        0.22 ±  6%  mpstat.cpu.all.usr%
    141224 ±  9%     -86.4%      19203 ± 36%  numa-vmstat.node1.nr_active_anon
    146313 ±  7%     -84.9%      22087 ± 38%  numa-vmstat.node1.nr_shmem
    141224 ±  9%     -86.4%      19203 ± 36%  numa-vmstat.node1.nr_zone_active_anon
     91197 ± 22%     -93.7%       5768 ± 19%  sched_debug.cpu.nr_switches.avg
   6021808 ± 30%     -96.1%     232641 ± 32%  sched_debug.cpu.nr_switches.max
    616189 ± 24%     -95.9%      25525 ± 31%  sched_debug.cpu.nr_switches.stddev
    144168 ± 10%     -83.0%      24516 ± 27%  proc-vmstat.nr_active_anon
   3501815            -3.8%    3369305        proc-vmstat.nr_file_pages
     28035            -5.9%      26386        proc-vmstat.nr_mapped
    153431 ± 10%     -79.8%      31026 ± 22%  proc-vmstat.nr_shmem
     25506            -1.6%      25092        proc-vmstat.nr_slab_reclaimable
    144168 ± 10%     -83.0%      24516 ± 27%  proc-vmstat.nr_zone_active_anon
   1443064            -7.1%    1340212        proc-vmstat.pgactivate
      2557 ± 14%    +158.3%       6606 ± 10%  filebench.sum_bytes_mb/s
  19644866 ± 14%    +158.3%   50742596 ± 10%  filebench.sum_operations
    327385 ± 14%    +158.3%     845638 ± 10%  filebench.sum_operations/s
    163882 ± 14%    +189.5%     474419 ± 12%  filebench.sum_reads/s
      0.01 ± 15%     -65.7%       0.00        filebench.sum_time_ms/op
    163502 ± 14%    +127.0%     371220 ±  9%  filebench.sum_writes/s
     56.83           +29.0%      73.33        filebench.time.percent_of_cpu_this_job_got
     85.87 ±  2%     +20.1%     103.10 ±  2%  filebench.time.system_time
      8.54 ± 10%    +115.4%      18.39 ± 16%  filebench.time.user_time
   9795275 ± 14%     -99.3%      67709 ± 70%  filebench.time.voluntary_context_switches
      0.01 ± 29%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.01 ± 19%    -100.0%       0.00        perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.00 ± 67%    +469.2%       0.01 ± 12%  perf-sched.total_sch_delay.average.ms
      1.33 ± 13%    +975.0%      14.30 ± 33%  perf-sched.total_wait_and_delay.average.ms
    724911 ± 10%     -89.6%      75232 ± 36%  perf-sched.total_wait_and_delay.count.ms
      1.33 ± 13%    +976.1%      14.29 ± 33%  perf-sched.total_wait_time.average.ms
      3.47 ± 11%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     54.35 ±  8%    +403.1%     273.44 ± 19%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
     19.50 ± 30%    -100.0%       0.00        perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
    280.83 ± 12%     -79.1%      58.83 ± 24%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
    649458 ± 10%     -99.1%       6085 ± 56%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_interruptible.netfs_start_io_read
      4.62 ± 11%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      1001           +25.4%       1254 ± 17%  perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.01 ± 22%    -100.0%       0.00        perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
     54.34 ±  8%    +403.2%     273.41 ± 19%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.01 ± 19%    -100.0%       0.00        perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      1001           +25.4%       1254 ± 17%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.15 ± 44%     -69.5%       0.05 ± 28%  perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_interruptible.netfs_start_io_read
      3.23 ±100%      -1.0        2.18 ±142%  perf-profile.calltrace.cycles-pp.cmd_stat
      3.23 ±100%      -1.0        2.18 ±142%  perf-profile.calltrace.cycles-pp.dispatch_events.cmd_stat
      3.22 ±100%      -1.0        2.17 ±141%  perf-profile.calltrace.cycles-pp.process_interval.dispatch_events.cmd_stat
      3.12 ±100%      -1.0        2.12 ±142%  perf-profile.calltrace.cycles-pp.read_counters.process_interval.dispatch_events.cmd_stat
      0.42 ± 34%      -0.2        0.24 ± 28%  perf-profile.children.cycles-pp.perf_iterate_sb
      0.42 ± 22%      -0.1        0.28 ± 22%  perf-profile.children.cycles-pp.set_pte_range
      0.11 ± 38%      -0.1        0.04 ± 71%  perf-profile.children.cycles-pp.copy_page_from_iter_atomic
      0.11 ± 56%      -0.1        0.04 ± 71%  perf-profile.children.cycles-pp.read@plt
      0.02 ±141%      +0.1        0.12 ± 29%  perf-profile.children.cycles-pp.aa_file_perm
      0.07 ± 55%      +0.1        0.17 ± 29%  perf-profile.children.cycles-pp.fault_in_iov_iter_readable
      0.07 ± 55%      +0.1        0.17 ± 29%  perf-profile.children.cycles-pp.fault_in_readable
      0.09 ± 50%      +0.1        0.22 ± 28%  perf-profile.children.cycles-pp.getenv
      0.21 ± 30%      +0.2        0.37 ± 34%  perf-profile.children.cycles-pp.__perf_read_group_add
      0.19 ± 44%      +0.2        0.36 ± 34%  perf-profile.children.cycles-pp.pcpu_alloc_noprof
      0.82 ± 11%      +0.4        1.20 ± 13%  perf-profile.children.cycles-pp.sched_balance_update_blocked_averages
      0.11 ± 38%      -0.1        0.04 ± 71%  perf-profile.self.cycles-pp.copy_page_from_iter_atomic
      0.11 ± 56%      -0.1        0.04 ± 71%  perf-profile.self.cycles-pp.read@plt
      0.02 ±141%      +0.1        0.12 ± 29%  perf-profile.self.cycles-pp.aa_file_perm
      0.02 ±141%      +0.1        0.12 ± 31%  perf-profile.self.cycles-pp.getenv
      5.49 ±  4%     +87.1%      10.27 ±  4%  perf-stat.i.MPKI
 6.113e+08 ±  6%     -21.6%  4.793e+08 ±  5%  perf-stat.i.branch-instructions
  12875097            -9.6%   11640297        perf-stat.i.branch-misses
  26605878 ±  8%     +61.4%   42952527 ±  5%  perf-stat.i.cache-misses
  89659393 ±  6%     +53.9%   1.38e+08 ±  6%  perf-stat.i.cache-references
    126410 ± 13%     -93.0%       8884 ± 15%  perf-stat.i.context-switches
      1.85 ±  2%      +8.3%       2.00 ±  2%  perf-stat.i.cpi
 2.757e+09 ±  6%     -17.8%  2.265e+09 ±  4%  perf-stat.i.instructions
      0.58 ±  2%      -8.1%       0.53 ±  2%  perf-stat.i.ipc
      1.00 ± 13%     -97.3%       0.03 ± 57%  perf-stat.i.metric.K/sec
      9.63 ±  4%     +96.7%      18.95 ±  2%  perf-stat.overall.MPKI
      2.11 ±  5%      +0.3        2.42 ±  5%  perf-stat.overall.branch-miss-rate%
      1.56 ±  6%     +19.7%       1.86 ±  4%  perf-stat.overall.cpi
    161.89 ±  7%     -39.3%      98.29 ±  4%  perf-stat.overall.cycles-between-cache-misses
      0.65 ±  5%     -16.5%       0.54 ±  5%  perf-stat.overall.ipc
 6.088e+08 ±  6%     -21.3%  4.791e+08 ±  5%  perf-stat.ps.branch-instructions
  12794450            -9.6%   11566995        perf-stat.ps.branch-misses
  26464019 ±  8%     +62.1%   42902925 ±  4%  perf-stat.ps.cache-misses
  89144844 ±  7%     +54.5%  1.378e+08 ±  6%  perf-stat.ps.cache-references
    126023 ± 13%     -93.0%       8808 ± 15%  perf-stat.ps.context-switches
 2.746e+09 ±  6%     -17.5%  2.264e+09 ±  4%  perf-stat.ps.instructions
 4.542e+11 ±  6%     -17.4%  3.753e+11 ±  4%  perf-stat.total.instructions
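
As a cross-check of the table above, the following minimal Python sketch recomputes a few of the reported figures from the mean values. It assumes that %change is the relative difference between the two commits' means, and that MPKI means cache misses per thousand instructions; neither definition is stated in the report. The helper name pct_change is illustrative only and is not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent (assumed definition)."""
    return (new - base) / base * 100.0


# Mean values copied from the rows above, as (6ed469df0b, d6a77668a7).
ops_per_s = (327385, 845638)         # filebench.sum_operations/s
reads_per_s = (163882, 474419)       # filebench.sum_reads/s
writes_per_s = (163502, 371220)      # filebench.sum_writes/s
ctx_switches = (126023, 8808)        # perf-stat.ps.context-switches
cache_misses = (26464019, 42902925)  # perf-stat.ps.cache-misses
instructions = (2.746e9, 2.264e9)    # perf-stat.ps.instructions

ops_change = pct_change(*ops_per_s)
cs_change = pct_change(*ctx_switches)

# Assumed MPKI definition: cache misses per 1000 instructions.
mpki = tuple(m / i * 1000.0 for m, i in zip(cache_misses, instructions))

# Reads plus writes should roughly add up to total operations per second.
rw_sum = tuple(r + w for r, w in zip(reads_per_s, writes_per_s))

print(f"sum_operations/s change: {ops_change:+.1f}%")
print(f"context-switches change: {cs_change:+.1f}%")
print(f"MPKI (parent, commit):   {mpki[0]:.2f}, {mpki[1]:.2f}")
print(f"reads/s + writes/s:      {rw_sum[0]}, {rw_sum[1]}")
```

Under these assumptions the recomputed changes match the reported +158.3% and -93.0%. The MPKI values come out close to the perf-stat.overall.MPKI row; small differences are expected because the report averages across runs, while this sketch divides the already averaged means.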




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

