From: kernel test robot <oliver.sang@intel.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Matthew Wilcox <willy@infradead.org>,
kernel test robot <oliver.sang@intel.com>,
<linux-fsdevel@vger.kernel.org>, <ying.huang@intel.com>,
<feng.tang@intel.com>, <fengwei.yin@intel.com>
Subject: [cel:simple-offset-maple] [libfs] a616bc6667: aim9.disk_src.ops_per_sec 11.8% improvement
Date: Mon, 19 Feb 2024 13:44:05 +0800
Message-ID: <202402191308.8e7ee8c7-oliver.sang@intel.com>
Hello,
kernel test robot noticed an 11.8% improvement of aim9.disk_src.ops_per_sec on:
commit: a616bc666748063733c62e15ea417a90772a40e0 ("libfs: Convert simple directory offsets to use a Maple Tree")
git://git.kernel.org/cgit/linux/kernel/git/cel/linux simple-offset-maple
testcase: aim9
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
parameters:
testtime: 300s
test: disk_src
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240219/202402191308.8e7ee8c7-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/disk_src/aim9/300s
commit:
f3f24869a1 ("test_maple_tree: testing the cyclic allocation")
a616bc6667 ("libfs: Convert simple directory offsets to use a Maple Tree")
f3f24869a1d7cde1 a616bc666748063733c62e15ea4
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.34 ± 4% -0.1 0.20 ± 4% mpstat.cpu.all.soft%
0.00 ± 28% +58.3% 0.00 ± 17% perf-sched.sch_delay.max.ms.ipmi_thread.kthread.ret_from_fork.ret_from_fork_asm
1464 ± 2% +14.0% 1668 ± 4% vmstat.system.cs
164231 +11.8% 183678 aim9.disk_src.ops_per_sec
1309 ± 15% +2643.5% 35915 ± 23% aim9.time.involuntary_context_switches
91.00 +5.5% 96.00 aim9.time.percent_of_cpu_this_job_got
212.54 +3.5% 220.06 aim9.time.system_time
62.58 +10.2% 68.94 aim9.time.user_time
21685 -7.1% 20144 proc-vmstat.nr_slab_reclaimable
6611541 -88.6% 750673 ± 7% proc-vmstat.numa_hit
6561447 -89.3% 700947 ± 7% proc-vmstat.numa_local
5747 +3.7% 5960 proc-vmstat.pgactivate
26113963 -93.7% 1648373 ± 17% proc-vmstat.pgalloc_normal
26042963 -93.7% 1628178 ± 18% proc-vmstat.pgfree
2.07 -1.2% 2.04 perf-stat.i.MPKI
6.738e+08 +3.0% 6.94e+08 perf-stat.i.branch-instructions
2.94 -0.2 2.70 perf-stat.i.branch-miss-rate%
20408670 -5.1% 19363031 perf-stat.i.branch-misses
15.11 +2.7 17.77 perf-stat.i.cache-miss-rate%
46824224 -14.7% 39962840 perf-stat.i.cache-references
1419 ± 2% +14.4% 1623 ± 5% perf-stat.i.context-switches
1.88 -1.3% 1.85 perf-stat.i.cpi
9.453e+08 +2.2% 9.659e+08 perf-stat.i.dTLB-loads
0.22 ± 5% +0.0 0.25 ± 3% perf-stat.i.dTLB-store-miss-rate%
8.8e+08 -6.8% 8.205e+08 perf-stat.i.dTLB-stores
1536484 +7.9% 1657233 perf-stat.i.iTLB-load-misses
2279 -6.0% 2142 perf-stat.i.instructions-per-iTLB-miss
0.54 +1.3% 0.54 perf-stat.i.ipc
786.95 +7.1% 843.12 perf-stat.i.metric.K/sec
47.07 +1.1 48.17 perf-stat.i.node-load-miss-rate%
87561 ± 4% +17.2% 102647 ± 6% perf-stat.i.node-load-misses
2.01 -1.2% 1.99 perf-stat.overall.MPKI
3.03 -0.2 2.79 perf-stat.overall.branch-miss-rate%
15.07 +2.6 17.67 perf-stat.overall.cache-miss-rate%
1.84 -1.2% 1.82 perf-stat.overall.cpi
0.22 ± 5% +0.0 0.24 ± 3% perf-stat.overall.dTLB-store-miss-rate%
2283 -6.1% 2144 perf-stat.overall.instructions-per-iTLB-miss
0.54 +1.2% 0.55 perf-stat.overall.ipc
44.15 +1.8 45.93 perf-stat.overall.node-load-miss-rate%
6.715e+08 +3.0% 6.917e+08 perf-stat.ps.branch-instructions
20340341 -5.1% 19299968 perf-stat.ps.branch-misses
46667379 -14.7% 39829580 perf-stat.ps.cache-references
1414 ± 2% +14.4% 1618 ± 5% perf-stat.ps.context-switches
9.421e+08 +2.2% 9.627e+08 perf-stat.ps.dTLB-loads
8.771e+08 -6.8% 8.178e+08 perf-stat.ps.dTLB-stores
1531338 +7.9% 1651678 perf-stat.ps.iTLB-load-misses
87275 ± 4% +17.3% 102341 ± 6% perf-stat.ps.node-load-misses
5.62 ± 13% -1.9 3.69 ± 12% perf-profile.calltrace.cycles-pp.shmem_mknod.lookup_open.open_last_lookups.path_openat.do_filp_open
7.87 ± 13% -1.9 5.95 ± 11% perf-profile.calltrace.cycles-pp.lookup_open.open_last_lookups.path_openat.do_filp_open.do_sys_openat2
8.47 ± 13% -1.9 6.59 ± 10% perf-profile.calltrace.cycles-pp.open_last_lookups.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
2.97 ± 12% -1.8 1.16 ± 13% perf-profile.calltrace.cycles-pp.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups.path_openat
0.00 +1.0 0.98 ± 13% perf-profile.calltrace.cycles-pp.mas_alloc_cyclic.mtree_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open
0.00 +1.0 1.00 ± 40% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn
0.00 +1.0 1.03 ± 40% perf-profile.calltrace.cycles-pp.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread
0.00 +1.1 1.06 ± 40% perf-profile.calltrace.cycles-pp.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
0.00 +1.1 1.06 ± 40% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.00 +1.1 1.10 ± 39% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.00 +1.1 1.10 ± 14% perf-profile.calltrace.cycles-pp.mtree_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups
0.00 +1.2 1.20 ± 13% perf-profile.calltrace.cycles-pp.mas_erase.mtree_erase.simple_offset_remove.shmem_unlink.vfs_unlink
0.00 +1.3 1.27 ± 38% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
0.00 +1.3 1.27 ± 38% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
0.00 +1.3 1.27 ± 38% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
0.00 +1.4 1.35 ± 12% perf-profile.calltrace.cycles-pp.mtree_erase.simple_offset_remove.shmem_unlink.vfs_unlink.do_unlinkat
15.22 ± 8% -2.8 12.40 ± 8% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
14.50 ± 8% -2.8 11.72 ± 8% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
4.73 ± 13% -2.8 1.97 ± 15% perf-profile.children.cycles-pp.irq_exit_rcu
3.50 ± 12% -2.1 1.41 ± 12% perf-profile.children.cycles-pp.kmem_cache_alloc_lru
5.63 ± 13% -1.9 3.70 ± 12% perf-profile.children.cycles-pp.shmem_mknod
7.88 ± 13% -1.9 5.97 ± 11% perf-profile.children.cycles-pp.lookup_open
8.49 ± 13% -1.9 6.62 ± 10% perf-profile.children.cycles-pp.open_last_lookups
2.97 ± 12% -1.8 1.16 ± 13% perf-profile.children.cycles-pp.simple_offset_add
2.90 ± 22% -1.8 1.15 ± 41% perf-profile.children.cycles-pp.rcu_do_batch
4.47 ± 14% -1.7 2.76 ± 24% perf-profile.children.cycles-pp.__do_softirq
1.85 ± 15% -1.7 0.14 ± 28% perf-profile.children.cycles-pp.___slab_alloc
3.00 ± 22% -1.7 1.34 ± 38% perf-profile.children.cycles-pp.rcu_core
1.66 ± 15% -1.6 0.05 ± 68% perf-profile.children.cycles-pp.allocate_slab
0.92 ± 18% -0.6 0.31 ± 19% perf-profile.children.cycles-pp.__call_rcu_common
0.88 ± 27% -0.6 0.31 ± 43% perf-profile.children.cycles-pp.__slab_free
0.28 ± 15% -0.2 0.12 ± 25% perf-profile.children.cycles-pp.xas_load
0.20 ± 18% -0.1 0.08 ± 30% perf-profile.children.cycles-pp.rcu_segcblist_enqueue
0.12 ± 30% -0.1 0.05 ± 65% perf-profile.children.cycles-pp.rcu_nocb_try_bypass
0.00 +0.1 0.10 ± 27% perf-profile.children.cycles-pp.mas_wr_end_piv
0.00 +0.2 0.17 ± 22% perf-profile.children.cycles-pp.mas_leaf_max_gap
0.00 +0.2 0.18 ± 24% perf-profile.children.cycles-pp.mtree_range_walk
0.00 +0.2 0.24 ± 22% perf-profile.children.cycles-pp.mas_anode_descend
0.00 +0.3 0.29 ± 16% perf-profile.children.cycles-pp.mas_wr_walk
0.00 +0.3 0.31 ± 23% perf-profile.children.cycles-pp.mas_update_gap
0.00 +0.3 0.32 ± 17% perf-profile.children.cycles-pp.mas_wr_append
0.00 +0.4 0.37 ± 15% perf-profile.children.cycles-pp.mas_empty_area
0.00 +0.5 0.47 ± 18% perf-profile.children.cycles-pp.mas_wr_node_store
0.00 +1.0 0.99 ± 13% perf-profile.children.cycles-pp.mas_alloc_cyclic
0.05 ± 82% +1.0 1.10 ± 39% perf-profile.children.cycles-pp.smpboot_thread_fn
0.01 ±264% +1.0 1.06 ± 40% perf-profile.children.cycles-pp.run_ksoftirqd
0.22 ± 36% +1.1 1.28 ± 38% perf-profile.children.cycles-pp.ret_from_fork
0.22 ± 36% +1.1 1.28 ± 38% perf-profile.children.cycles-pp.ret_from_fork_asm
0.21 ± 38% +1.1 1.27 ± 38% perf-profile.children.cycles-pp.kthread
0.00 +1.1 1.11 ± 14% perf-profile.children.cycles-pp.mtree_alloc_cyclic
0.00 +1.2 1.21 ± 14% perf-profile.children.cycles-pp.mas_erase
0.00 +1.4 1.35 ± 12% perf-profile.children.cycles-pp.mtree_erase
0.87 ± 27% -0.6 0.31 ± 42% perf-profile.self.cycles-pp.__slab_free
0.53 ± 19% -0.4 0.18 ± 23% perf-profile.self.cycles-pp.__call_rcu_common
0.57 ± 10% -0.3 0.26 ± 21% perf-profile.self.cycles-pp.kmem_cache_alloc_lru
0.89 ± 14% -0.3 0.59 ± 15% perf-profile.self.cycles-pp.kmem_cache_free
0.19 ± 21% -0.1 0.06 ± 65% perf-profile.self.cycles-pp.rcu_segcblist_enqueue
0.10 ± 20% -0.1 0.04 ± 81% perf-profile.self.cycles-pp.xas_load
0.08 ± 19% -0.0 0.04 ± 61% perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt
0.00 +0.1 0.09 ± 30% perf-profile.self.cycles-pp.mtree_erase
0.00 +0.1 0.10 ± 26% perf-profile.self.cycles-pp.mtree_alloc_cyclic
0.00 +0.1 0.10 ± 27% perf-profile.self.cycles-pp.mas_wr_end_piv
0.00 +0.1 0.12 ± 38% perf-profile.self.cycles-pp.mas_empty_area
0.00 +0.1 0.14 ± 38% perf-profile.self.cycles-pp.mas_update_gap
0.00 +0.1 0.14 ± 20% perf-profile.self.cycles-pp.mas_wr_append
0.00 +0.2 0.16 ± 23% perf-profile.self.cycles-pp.mas_leaf_max_gap
0.00 +0.2 0.18 ± 24% perf-profile.self.cycles-pp.mtree_range_walk
0.00 +0.2 0.18 ± 29% perf-profile.self.cycles-pp.mas_alloc_cyclic
0.00 +0.2 0.22 ± 32% perf-profile.self.cycles-pp.mas_erase
0.00 +0.2 0.24 ± 22% perf-profile.self.cycles-pp.mas_anode_descend
0.00 +0.3 0.27 ± 16% perf-profile.self.cycles-pp.mas_wr_walk
0.00 +0.3 0.34 ± 20% perf-profile.self.cycles-pp.mas_wr_node_store
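The %change column above can be sanity-checked against the two mean columns. What follows is a minimal sketch, assuming (from the rows themselves, not from lkp-tests documentation) that counter-style rows report a relative change in percent, while rows that are already percentages (mpstat, miss rates, perf-profile cycles-pp) report the absolute difference in percentage points. The function names are hypothetical and are not part of any lkp-tests tooling.

```python
def relative_change(base: float, new: float) -> float:
    """Percent change from the base commit's mean to the new commit's mean."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, for metrics that are already expressed in percent."""
    return new - base


# aim9.disk_src.ops_per_sec: 164231 -> 183678
print(f"{relative_change(164231, 183678):+.1f}%")   # +11.8%

# proc-vmstat.nr_slab_reclaimable: 21685 -> 20144
print(f"{relative_change(21685, 20144):+.1f}%")     # -7.1%

# perf-profile.calltrace.cycles-pp.shmem_mknod...: 5.62 -> 3.69
print(f"{point_change(5.62, 3.69):+.1f}")           # -1.9
```

Because the table prints rounded means, recomputing from the displayed values can differ from the reported change in the last digit for some rows.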
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki