From: kernel test robot <oliver.sang@intel.com>
To: David Howells <dhowells@redhat.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>,
Christian Brauner <brauner@kernel.org>,
Steve French <sfrench@samba.org>,
Paulo Alcantara <pc@manguebit.com>,
Trond Myklebust <trondmy@kernel.org>,
Jeff Layton <jlayton@kernel.org>, <netfs@lists.linux.dev>,
<linux-fsdevel@vger.kernel.org>, <oliver.sang@intel.com>
Subject: [linus:master] [netfs] d6a77668a7: filebench.sum_operations/s 158.3% improvement
Date: Tue, 26 Nov 2024 16:44:23 +0800
Message-ID: <202411261616.c29946d8-lkp@intel.com>
Hello,
kernel test robot noticed a 158.3% improvement of filebench.sum_operations/s on:
commit: d6a77668a708f0b5ca6713b39c178c9d9563c35b ("netfs: Downgrade i_rwsem for a buffered write")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: filebench
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:
disk: 1HDD
fs: xfs
fs2: cifs
test: randomrw.f
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241126/202411261616.c29946d8-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
gcc-12/performance/1HDD/cifs/xfs/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/randomrw.f/filebench
commit:
6ed469df0b ("nilfs2: fix kernel bug due to missing clearing of buffer delay flag")
d6a77668a7 ("netfs: Downgrade i_rwsem for a buffered write")
6ed469df0bfbef3e d6a77668a708f0b5ca6713b39c1
---------------- ---------------------------
%stddev %change %stddev
\ | \
10356023 ± 13% -88.4% 1203898 ± 8% cpuidle..usage
1862 ± 17% -45.6% 1013 ± 23% perf-c2c.HITM.local
564994 ± 9% -86.4% 76928 ± 36% numa-meminfo.node1.Active(anon)
585171 ± 7% -84.9% 88374 ± 38% numa-meminfo.node1.Shmem
124475 ± 13% -92.9% 8821 ± 14% vmstat.system.cs
9926 ± 6% -39.6% 5995 ± 4% vmstat.system.in
576365 ± 10% -83.0% 98054 ± 27% meminfo.Active(anon)
1481440 ± 4% -33.1% 991806 ± 2% meminfo.Committed_AS
613566 ± 10% -79.8% 124007 ± 22% meminfo.Shmem
0.02 ± 3% -0.0 0.02 ± 4% mpstat.cpu.all.irq%
0.60 ± 2% +0.1 0.69 mpstat.cpu.all.sys%
0.18 +0.0 0.22 ± 6% mpstat.cpu.all.usr%
141224 ± 9% -86.4% 19203 ± 36% numa-vmstat.node1.nr_active_anon
146313 ± 7% -84.9% 22087 ± 38% numa-vmstat.node1.nr_shmem
141224 ± 9% -86.4% 19203 ± 36% numa-vmstat.node1.nr_zone_active_anon
91197 ± 22% -93.7% 5768 ± 19% sched_debug.cpu.nr_switches.avg
6021808 ± 30% -96.1% 232641 ± 32% sched_debug.cpu.nr_switches.max
616189 ± 24% -95.9% 25525 ± 31% sched_debug.cpu.nr_switches.stddev
144168 ± 10% -83.0% 24516 ± 27% proc-vmstat.nr_active_anon
3501815 -3.8% 3369305 proc-vmstat.nr_file_pages
28035 -5.9% 26386 proc-vmstat.nr_mapped
153431 ± 10% -79.8% 31026 ± 22% proc-vmstat.nr_shmem
25506 -1.6% 25092 proc-vmstat.nr_slab_reclaimable
144168 ± 10% -83.0% 24516 ± 27% proc-vmstat.nr_zone_active_anon
1443064 -7.1% 1340212 proc-vmstat.pgactivate
2557 ± 14% +158.3% 6606 ± 10% filebench.sum_bytes_mb/s
19644866 ± 14% +158.3% 50742596 ± 10% filebench.sum_operations
327385 ± 14% +158.3% 845638 ± 10% filebench.sum_operations/s
163882 ± 14% +189.5% 474419 ± 12% filebench.sum_reads/s
0.01 ± 15% -65.7% 0.00 filebench.sum_time_ms/op
163502 ± 14% +127.0% 371220 ± 9% filebench.sum_writes/s
56.83 +29.0% 73.33 filebench.time.percent_of_cpu_this_job_got
85.87 ± 2% +20.1% 103.10 ± 2% filebench.time.system_time
8.54 ± 10% +115.4% 18.39 ± 16% filebench.time.user_time
9795275 ± 14% -99.3% 67709 ± 70% filebench.time.voluntary_context_switches
0.01 ± 29% -100.0% 0.00 perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
0.01 ± 19% -100.0% 0.00 perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
0.00 ± 67% +469.2% 0.01 ± 12% perf-sched.total_sch_delay.average.ms
1.33 ± 13% +975.0% 14.30 ± 33% perf-sched.total_wait_and_delay.average.ms
724911 ± 10% -89.6% 75232 ± 36% perf-sched.total_wait_and_delay.count.ms
1.33 ± 13% +976.1% 14.29 ± 33% perf-sched.total_wait_time.average.ms
3.47 ± 11% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
54.35 ± 8% +403.1% 273.44 ± 19% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
19.50 ± 30% -100.0% 0.00 perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
280.83 ± 12% -79.1% 58.83 ± 24% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
649458 ± 10% -99.1% 6085 ± 56% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_interruptible.netfs_start_io_read
4.62 ± 11% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
1001 +25.4% 1254 ± 17% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.01 ± 22% -100.0% 0.00 perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
54.34 ± 8% +403.2% 273.41 ± 19% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.01 ± 19% -100.0% 0.00 perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
1001 +25.4% 1254 ± 17% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.15 ± 44% -69.5% 0.05 ± 28% perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_interruptible.netfs_start_io_read
3.23 ±100% -1.0 2.18 ±142% perf-profile.calltrace.cycles-pp.cmd_stat
3.23 ±100% -1.0 2.18 ±142% perf-profile.calltrace.cycles-pp.dispatch_events.cmd_stat
3.22 ±100% -1.0 2.17 ±141% perf-profile.calltrace.cycles-pp.process_interval.dispatch_events.cmd_stat
3.12 ±100% -1.0 2.12 ±142% perf-profile.calltrace.cycles-pp.read_counters.process_interval.dispatch_events.cmd_stat
0.42 ± 34% -0.2 0.24 ± 28% perf-profile.children.cycles-pp.perf_iterate_sb
0.42 ± 22% -0.1 0.28 ± 22% perf-profile.children.cycles-pp.set_pte_range
0.11 ± 38% -0.1 0.04 ± 71% perf-profile.children.cycles-pp.copy_page_from_iter_atomic
0.11 ± 56% -0.1 0.04 ± 71% perf-profile.children.cycles-pp.read@plt
0.02 ±141% +0.1 0.12 ± 29% perf-profile.children.cycles-pp.aa_file_perm
0.07 ± 55% +0.1 0.17 ± 29% perf-profile.children.cycles-pp.fault_in_iov_iter_readable
0.07 ± 55% +0.1 0.17 ± 29% perf-profile.children.cycles-pp.fault_in_readable
0.09 ± 50% +0.1 0.22 ± 28% perf-profile.children.cycles-pp.getenv
0.21 ± 30% +0.2 0.37 ± 34% perf-profile.children.cycles-pp.__perf_read_group_add
0.19 ± 44% +0.2 0.36 ± 34% perf-profile.children.cycles-pp.pcpu_alloc_noprof
0.82 ± 11% +0.4 1.20 ± 13% perf-profile.children.cycles-pp.sched_balance_update_blocked_averages
0.11 ± 38% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.copy_page_from_iter_atomic
0.11 ± 56% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.read@plt
0.02 ±141% +0.1 0.12 ± 29% perf-profile.self.cycles-pp.aa_file_perm
0.02 ±141% +0.1 0.12 ± 31% perf-profile.self.cycles-pp.getenv
5.49 ± 4% +87.1% 10.27 ± 4% perf-stat.i.MPKI
6.113e+08 ± 6% -21.6% 4.793e+08 ± 5% perf-stat.i.branch-instructions
12875097 -9.6% 11640297 perf-stat.i.branch-misses
26605878 ± 8% +61.4% 42952527 ± 5% perf-stat.i.cache-misses
89659393 ± 6% +53.9% 1.38e+08 ± 6% perf-stat.i.cache-references
126410 ± 13% -93.0% 8884 ± 15% perf-stat.i.context-switches
1.85 ± 2% +8.3% 2.00 ± 2% perf-stat.i.cpi
2.757e+09 ± 6% -17.8% 2.265e+09 ± 4% perf-stat.i.instructions
0.58 ± 2% -8.1% 0.53 ± 2% perf-stat.i.ipc
1.00 ± 13% -97.3% 0.03 ± 57% perf-stat.i.metric.K/sec
9.63 ± 4% +96.7% 18.95 ± 2% perf-stat.overall.MPKI
2.11 ± 5% +0.3 2.42 ± 5% perf-stat.overall.branch-miss-rate%
1.56 ± 6% +19.7% 1.86 ± 4% perf-stat.overall.cpi
161.89 ± 7% -39.3% 98.29 ± 4% perf-stat.overall.cycles-between-cache-misses
0.65 ± 5% -16.5% 0.54 ± 5% perf-stat.overall.ipc
6.088e+08 ± 6% -21.3% 4.791e+08 ± 5% perf-stat.ps.branch-instructions
12794450 -9.6% 11566995 perf-stat.ps.branch-misses
26464019 ± 8% +62.1% 42902925 ± 4% perf-stat.ps.cache-misses
89144844 ± 7% +54.5% 1.378e+08 ± 6% perf-stat.ps.cache-references
126023 ± 13% -93.0% 8808 ± 15% perf-stat.ps.context-switches
2.746e+09 ± 6% -17.5% 2.264e+09 ± 4% perf-stat.ps.instructions
4.542e+11 ± 6% -17.4% 3.753e+11 ± 4% perf-stat.total.instructions
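The %change column above appears to be the relative difference between the two per-commit means, (new - old) / old. A minimal Python sketch, assuming that definition (the helper name pct_change is illustrative and not part of lkp-tests), reproduces two rows of the table:

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change from base to patched, in percent."""
    return (patched - base) / base * 100.0


# Mean values copied from the table above:
# (parent commit 6ed469df0b, patched commit d6a77668a7).
rows = {
    "filebench.sum_operations/s": (327385, 845638),
    "vmstat.system.cs": (124475, 8821),
}

for metric, (base, patched) in rows.items():
    # Prints +158.3% for sum_operations/s and -92.9% for vmstat.system.cs
    print(f"{metric}: {pct_change(base, patched):+.1f}%")
```

Rows whose change is printed without a percent sign (for example mpstat.cpu.all.sys% or the perf-profile entries) are already percentages, so their change appears to be a plain difference in percentage points rather than a relative change.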
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki