From: kernel test robot <oliver.sang@intel.com>
To: Huang Ying <ying.huang@intel.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@techsingularity.net>,
Vlastimil Babka <vbabka@suse.cz>,
"David Hildenbrand" <david@redhat.com>,
Johannes Weiner <jweiner@redhat.com>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
Michal Hocko <mhocko@suse.com>,
"Pavel Tatashin" <pasha.tatashin@soleen.com>,
Matthew Wilcox <willy@infradead.org>,
Christoph Lameter <cl@linux.com>,
Arjan van de Ven <arjan@linux.intel.com>,
Sudeep Holla <sudeep.holla@arm.com>, <linux-mm@kvack.org>,
<ying.huang@intel.com>, <feng.tang@intel.com>,
<fengwei.yin@intel.com>, <oliver.sang@intel.com>
Subject: [linus:master] [mm, pcp] 6ccdcb6d3a: stress-ng.judy.ops_per_sec -4.7% regression
Date: Thu, 23 Nov 2023 13:03:34 +0800
Message-ID: <202311231029.3aa790-oliver.sang@intel.com>
Hello,
kernel test robot noticed a -4.7% regression of stress-ng.judy.ops_per_sec on:
commit: 6ccdcb6d3a741c4e005ca6ffd4a62ddf8b5bead3 ("mm, pcp: reduce detecting time of consecutive high order page freeing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: stress-ng
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:
nr_threads: 100%
testtime: 60s
class: cpu-cache
test: judy
disk: 1SSD
cpufreq_governor: performance
In addition, the commit has a significant impact on the following tests:
+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | lmbench3: lmbench3.TCP.socket.bandwidth.10MB.MB/sec 23.7% improvement                           |
| test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory |
| test parameters  | cpufreq_governor=performance                                                                    |
|                  | mode=development                                                                                |
|                  | nr_threads=100%                                                                                 |
|                  | test=TCP                                                                                        |
|                  | test_memory_size=50%                                                                            |
+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.file-ioctl.ops_per_sec -6.6% regression                                    |
| test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory        |
| test parameters  | class=filesystem                                                                                |
|                  | cpufreq_governor=performance                                                                    |
|                  | disk=1SSD                                                                                       |
|                  | fs=btrfs                                                                                        |
|                  | nr_threads=10%                                                                                  |
|                  | test=file-ioctl                                                                                 |
|                  | testtime=60s                                                                                    |
+------------------+-------------------------------------------------------------------------------------------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202311231029.3aa790-oliver.sang@intel.com
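The two tags above follow a fixed pattern: the reporter address, and a Closes: link built from the oe-lkp archive URL plus this report's Message-ID. A minimal sketch of assembling them, assuming that pattern; the helper name `fix_trailers` is hypothetical and not part of any kernel or lkp tooling:

```python
def fix_trailers(message_id: str, reporter: str, list_name: str = "oe-lkp") -> str:
    """Build the Reported-by/Closes trailer lines for a fix commit.

    Assumes the Closes: URL is the lore.kernel.org list archive followed
    by the report's Message-ID, as in the tags quoted above.
    """
    closes = f"https://lore.kernel.org/{list_name}/{message_id}"
    return f"Reported-by: {reporter}\nCloses: {closes}"


if __name__ == "__main__":
    print(fix_trailers(
        "202311231029.3aa790-oliver.sang@intel.com",
        "kernel test robot <oliver.sang@intel.com>",
    ))
```

The printed lines can be pasted at the end of the fix's commit message, next to the Signed-off-by line.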
Details are below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231123/202311231029.3aa790-oliver.sang@intel.com
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
cpu-cache/gcc-12/performance/1SSD/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-spr-2sp4/judy/stress-ng/60s
commit:
57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")
57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a
---------------- ---------------------------
%stddev %change %stddev
\ | \
4.57 ± 5% +46.8% 6.71 ± 17% iostat.cpu.system
2842 +1.0% 2871 turbostat.Bzy_MHz
0.12 ± 3% +0.4 0.55 ± 26% mpstat.cpu.all.soft%
3.05 ± 6% +1.8 4.86 ± 20% mpstat.cpu.all.sys%
81120642 -2.9% 78746159 proc-vmstat.numa_hit
80886548 -2.9% 78513494 proc-vmstat.numa_local
82771023 -2.9% 80399459 proc-vmstat.pgalloc_normal
82356596 -2.9% 79991041 proc-vmstat.pgfree
12325708 ± 3% +5.3% 12974746 perf-stat.i.dTLB-load-misses
0.38 ± 44% +27.2% 0.48 perf-stat.overall.cpi
668.74 ± 44% +24.7% 834.02 perf-stat.overall.cycles-between-cache-misses
0.00 ± 45% +0.0 0.01 ± 10% perf-stat.overall.dTLB-load-miss-rate%
10040254 ± 44% +26.0% 12650801 perf-stat.ps.dTLB-load-misses
7036371 ± 3% -2.8% 6842720 stress-ng.judy.Judy_delete_operations_per_sec
9244466 ± 3% -7.8% 8524505 ± 3% stress-ng.judy.Judy_insert_operations_per_sec
2912 ± 3% -4.7% 2774 stress-ng.judy.ops_per_sec
13316 ± 8% +22.8% 16355 ± 13% stress-ng.time.maximum_resident_set_size
445.86 ± 5% +64.2% 732.21 ± 15% stress-ng.time.system_time
40885 ± 40% +373.8% 193712 ± 11% sched_debug.cfs_rq:/.left_vruntime.avg
465264 ± 31% +142.5% 1128399 ± 5% sched_debug.cfs_rq:/.left_vruntime.stddev
8322 ± 34% +140.8% 20039 ± 17% sched_debug.cfs_rq:/.load.avg
40886 ± 40% +373.8% 193713 ± 11% sched_debug.cfs_rq:/.right_vruntime.avg
465274 ± 31% +142.5% 1128401 ± 5% sched_debug.cfs_rq:/.right_vruntime.stddev
818.77 ± 10% +43.3% 1172 ± 5% sched_debug.cpu.curr->pid.stddev
0.05 ± 74% +659.6% 0.41 ± 35% perf-sched.sch_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
0.10 ± 48% +140.3% 0.24 ± 11% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
0.01 ± 14% +102.6% 0.03 ± 29% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.05 ±122% +1322.6% 0.65 ± 20% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
1.70 ± 79% +729.3% 14.10 ± 48% perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1.08 ±101% +233.4% 3.60 ± 7% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
0.01 ± 8% +54.7% 0.02 ± 18% perf-sched.total_sch_delay.average.ms
0.18 ± 5% +555.7% 1.20 ± 38% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.21 ± 4% +524.6% 1.29 ± 47% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
235.65 ± 31% -57.0% 101.40 ± 17% perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
127.50 ±100% +126.3% 288.50 ± 9% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
125.83 ±144% +407.2% 638.17 ± 27% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
344.50 ± 36% +114.6% 739.33 ± 24% perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
0.92 ±114% +482.2% 5.38 ± 47% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
3.22 ± 89% +223.9% 10.44 ± 50% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
0.18 ± 43% +471.8% 1.01 ± 36% perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_anonymous_page
34.39 ± 46% +88.8% 64.95 ± 18% perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.21 ± 13% +813.6% 1.95 ± 38% perf-sched.wait_time.avg.ms.__cond_resched.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.constprop
0.18 ± 15% +457.1% 1.02 ± 58% perf-sched.wait_time.avg.ms.__cond_resched.unmap_vmas.unmap_region.constprop.0
417.61 ± 68% -87.6% 51.85 ±146% perf-sched.wait_time.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.22 ± 25% +614.2% 1.57 ± 71% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
0.18 ± 5% +556.3% 1.20 ± 38% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.21 ± 4% +524.6% 1.29 ± 47% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
38.72 ± 39% -53.1% 18.17 ± 30% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
235.60 ± 31% -57.0% 101.37 ± 17% perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
2.17 ± 30% +45.3% 3.16 ± 13% perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
1.02 ±131% +574.3% 6.90 ± 52% perf-sched.wait_time.max.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_anonymous_page
0.18 ±191% +92359.0% 169.05 ±219% perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
69.64 ± 44% +33.2% 92.76 ± 4% perf-sched.wait_time.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.64 ± 67% +653.6% 4.82 ± 54% perf-sched.wait_time.max.ms.__cond_resched.unmap_vmas.unmap_region.constprop.0
1.75 ± 49% +206.5% 5.38 ± 47% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
3.22 ± 89% +223.9% 10.44 ± 50% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
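For readers checking the table above: the %change column appears to be the relative difference between the parent commit's mean (left) and the tested commit's mean (right). For metrics that are already percentages (for example mpstat.cpu.all.sys%), the report seems to show the absolute difference in percentage points instead. A minimal sketch under that assumption; the function names are hypothetical and not taken from lkp-tests:

```python
def relative_change(base: float, head: float) -> float:
    """Percent change of `head` relative to `base`."""
    return (head - base) / base * 100.0


def point_change(base: float, head: float) -> float:
    """Absolute difference, used here for metrics that are already in percent."""
    return head - base


if __name__ == "__main__":
    # stress-ng.judy.ops_per_sec: 2912 -> 2774
    print(f"ops_per_sec: {relative_change(2912, 2774):+.1f}%")
    # stress-ng.time.system_time: 445.86 -> 732.21
    print(f"system_time: {relative_change(445.86, 732.21):+.1f}%")
    # mpstat.cpu.all.sys%: 3.05 -> 4.86 (percentage points)
    print(f"sys%: {point_change(3.05, 4.86):+.1f}")
```

Because the report prints rounded means, recomputed values can differ from the listed %change in the last digit.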
***************************************************************************************************
lkp-ivb-2ep1: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase:
gcc-12/performance/x86_64-rhel-8.3/development/100%/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/TCP/50%/lmbench3
commit:
57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")
57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.07 ± 38% +105.0% 0.14 ± 32% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
26.75 -4.9% 25.45 turbostat.RAMWatt
678809 +7.2% 727594 ± 2% vmstat.system.cs
97929782 -13.1% 85054266 numa-numastat.node0.local_node
97933343 -13.1% 85056081 numa-numastat.node0.numa_hit
97933344 -13.1% 85055901 numa-vmstat.node0.numa_hit
97929783 -13.1% 85054086 numa-vmstat.node0.numa_local
32188 +23.7% 39813 lmbench3.TCP.socket.bandwidth.10MB.MB/sec
652.63 -4.4% 624.04 lmbench3.time.elapsed_time
652.63 -4.4% 624.04 lmbench3.time.elapsed_time.max
8597 -5.9% 8092 lmbench3.time.system_time
0.88 ± 7% -0.1 0.76 ± 5% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.71 ± 10% -0.1 0.61 ± 7% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
0.78 ± 3% -0.1 0.70 ± 6% perf-profile.children.cycles-pp.security_socket_recvmsg
0.36 ± 9% +0.1 0.42 ± 11% perf-profile.children.cycles-pp.skb_page_frag_refill
0.40 ± 10% +0.1 0.48 ± 12% perf-profile.children.cycles-pp.sk_page_frag_refill
0.51 ± 4% -0.1 0.44 ± 13% perf-profile.self.cycles-pp.sock_read_iter
0.36 ± 10% +0.1 0.42 ± 11% perf-profile.self.cycles-pp.skb_page_frag_refill
158897 ± 2% -6.8% 148107 proc-vmstat.nr_anon_pages
160213 ± 2% -6.8% 149290 proc-vmstat.nr_inactive_anon
160213 ± 2% -6.8% 149290 proc-vmstat.nr_zone_inactive_anon
1.715e+08 -7.1% 1.593e+08 proc-vmstat.numa_hit
1.715e+08 -7.1% 1.592e+08 proc-vmstat.numa_local
1.367e+09 -7.1% 1.27e+09 proc-vmstat.pgalloc_normal
2324641 -2.7% 2261187 proc-vmstat.pgfault
1.367e+09 -7.1% 1.27e+09 proc-vmstat.pgfree
77011 -4.4% 73597 proc-vmstat.pgreuse
5.99 ± 3% -29.9% 4.20 ± 4% perf-stat.i.MPKI
7.914e+09 ± 2% +4.5% 8.271e+09 perf-stat.i.branch-instructions
1.51e+08 +4.6% 1.579e+08 perf-stat.i.branch-misses
7.65 ± 4% -0.9 6.73 ± 3% perf-stat.i.cache-miss-rate%
66394790 ± 2% -21.9% 51865866 ± 3% perf-stat.i.cache-misses
682132 +7.2% 731279 ± 2% perf-stat.i.context-switches
4.01 -16.0% 3.37 perf-stat.i.cpi
71772 ± 4% +11.5% 80055 ± 8% perf-stat.i.cycles-between-cache-misses
9.368e+09 ± 2% +3.6% 9.706e+09 perf-stat.i.dTLB-stores
33695419 ± 2% +7.1% 36096466 ± 2% perf-stat.i.iTLB-load-misses
573897 ± 35% -38.6% 352477 ± 19% perf-stat.i.iTLB-loads
4.09e+10 ± 2% +4.5% 4.273e+10 perf-stat.i.instructions
0.37 +4.3% 0.39 perf-stat.i.ipc
0.09 ± 22% -44.0% 0.05 ± 26% perf-stat.i.major-faults
490.16 ± 2% -8.6% 448.21 ± 2% perf-stat.i.metric.K/sec
635.38 ± 2% +3.5% 657.46 perf-stat.i.metric.M/sec
37.54 +2.3 39.84 perf-stat.i.node-load-miss-rate%
8300835 ± 2% -10.8% 7406820 ± 2% perf-stat.i.node-load-misses
76993977 ± 3% -6.6% 71936169 ± 3% perf-stat.i.node-loads
26.58 ± 4% +4.1 30.71 ± 3% perf-stat.i.node-store-miss-rate%
2341211 ± 4% -29.6% 1648802 ± 3% perf-stat.i.node-store-misses
34198780 ± 3% -33.2% 22857201 ± 3% perf-stat.i.node-stores
1.63 -25.5% 1.21 ± 3% perf-stat.overall.MPKI
10.67 -2.3 8.36 perf-stat.overall.cache-miss-rate%
2.83 -5.2% 2.69 perf-stat.overall.cpi
1740 +27.3% 2216 ± 3% perf-stat.overall.cycles-between-cache-misses
0.35 +5.5% 0.37 perf-stat.overall.ipc
9.73 -0.4 9.34 perf-stat.overall.node-load-miss-rate%
6.39 +0.3 6.72 perf-stat.overall.node-store-miss-rate%
7.914e+09 ± 2% +4.6% 8.276e+09 perf-stat.ps.branch-instructions
1.509e+08 +4.7% 1.579e+08 perf-stat.ps.branch-misses
66615187 ± 2% -22.1% 51881477 ± 3% perf-stat.ps.cache-misses
679734 +7.2% 729007 ± 2% perf-stat.ps.context-switches
9.369e+09 ± 2% +3.7% 9.712e+09 perf-stat.ps.dTLB-stores
33673038 ± 2% +7.2% 36098564 ± 2% perf-stat.ps.iTLB-load-misses
4.09e+10 ± 2% +4.6% 4.276e+10 perf-stat.ps.instructions
0.09 ± 23% -44.4% 0.05 ± 26% perf-stat.ps.major-faults
8328473 ± 2% -11.0% 7410272 ± 2% perf-stat.ps.node-load-misses
77301667 ± 3% -6.9% 71997671 ± 3% perf-stat.ps.node-loads
2344250 ± 4% -29.7% 1647553 ± 3% perf-stat.ps.node-store-misses
34315831 ± 3% -33.4% 22865994 ± 3% perf-stat.ps.node-stores
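Several of the perf-stat.overall rows above can be cross-checked against the raw counters. The sketch below assumes MPKI means cache misses per thousand instructions and that IPC is the reciprocal of CPI. These are the usual definitions, but how lkp-tests actually computes them is not stated in this report. The input values are copied from the perf-stat.ps and perf-stat.overall rows; the function names are hypothetical.

```python
# Values copied from the rows above: (parent commit, tested commit).
CACHE_MISSES = (66615187, 51881477)   # perf-stat.ps.cache-misses
INSTRUCTIONS = (4.09e10, 4.276e10)    # perf-stat.ps.instructions
OVERALL_CPI = (2.83, 2.69)            # perf-stat.overall.cpi


def mpki(cache_misses: float, instructions: float) -> float:
    """Cache misses per thousand instructions (assumed definition)."""
    return cache_misses / instructions * 1000.0


def ipc_from_cpi(cpi: float) -> float:
    """Instructions per cycle as the reciprocal of cycles per instruction."""
    return 1.0 / cpi


if __name__ == "__main__":
    rows = zip(("parent", "tested"), CACHE_MISSES, INSTRUCTIONS, OVERALL_CPI)
    for label, misses, instr, cpi in rows:
        print(f"{label}: MPKI={mpki(misses, instr):.2f} IPC={ipc_from_cpi(cpi):.2f}")
```

Under these assumptions the recomputed figures agree with the reported perf-stat.overall.MPKI (1.63 and 1.21) and perf-stat.overall.ipc (0.35 and 0.37) values. This is consistent with the tested commit executing more instructions while missing the cache less often on this machine.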
***************************************************************************************************
lkp-skl-d08: 36 threads 1 sockets Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
filesystem/gcc-12/performance/1SSD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-skl-d08/file-ioctl/stress-ng/60s
commit:
57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")
57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a
---------------- ---------------------------
%stddev %change %stddev
\ | \
127.00 ± 10% +36.1% 172.83 ± 15% perf-c2c.HITM.local
0.00 ± 72% +130.4% 0.01 ± 30% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.alloc_extent_state.__clear_extent_bit.btrfs_clone_files
14.83 ± 19% +33.7% 19.83 ± 10% sched_debug.cpu.nr_uninterruptible.max
339939 -6.6% 317593 stress-ng.file-ioctl.ops
5665 -6.6% 5293 stress-ng.file-ioctl.ops_per_sec
6444 ± 4% -25.2% 4820 ± 5% stress-ng.time.involuntary_context_switches
89198237 -6.5% 83411572 proc-vmstat.numa_hit
89117176 -6.8% 83056324 proc-vmstat.numa_local
92833230 -6.6% 86743293 proc-vmstat.pgalloc_normal
92791999 -6.6% 86700599 proc-vmstat.pgfree
0.25 ± 56% +110.2% 0.53 ± 12% perf-stat.i.major-faults
127575 ± 27% +138.3% 303957 ± 3% perf-stat.i.node-stores
0.25 ± 56% +110.2% 0.52 ± 12% perf-stat.ps.major-faults
125751 ± 27% +138.3% 299653 ± 3% perf-stat.ps.node-stores
1.199e+12 -2.1% 1.174e+12 perf-stat.total.instructions
15.80 -0.7 15.14 perf-profile.calltrace.cycles-pp.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
15.46 -0.6 14.84 perf-profile.calltrace.cycles-pp.btrfs_read_folio.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
9.84 -0.5 9.32 perf-profile.calltrace.cycles-pp.memcmp.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep.btrfs_remap_file_range
11.95 -0.4 11.52 perf-profile.calltrace.cycles-pp.btrfs_do_readpage.btrfs_read_folio.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare
8.72 ± 2% -0.4 8.28 perf-profile.calltrace.cycles-pp.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
5.56 ± 2% -0.4 5.18 perf-profile.calltrace.cycles-pp.__filemap_add_folio.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
0.64 ± 10% -0.3 0.36 ± 71% perf-profile.calltrace.cycles-pp.find_free_extent.btrfs_reserve_extent.__btrfs_prealloc_file_range.btrfs_prealloc_file_range.btrfs_fallocate
2.57 ± 5% -0.3 2.29 ± 2% perf-profile.calltrace.cycles-pp.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
2.44 ± 6% -0.3 2.17 ± 2% perf-profile.calltrace.cycles-pp.btrfs_fallocate.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64
2.53 ± 5% -0.3 2.26 ± 2% perf-profile.calltrace.cycles-pp.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.66 ± 9% -0.2 0.46 ± 45% perf-profile.calltrace.cycles-pp.btrfs_reserve_extent.__btrfs_prealloc_file_range.btrfs_prealloc_file_range.btrfs_fallocate.vfs_fallocate
1.42 ± 3% -0.1 1.31 ± 4% perf-profile.calltrace.cycles-pp.clear_state_bit.__clear_extent_bit.btrfs_invalidate_folio.truncate_cleanup_folio.truncate_inode_pages_range
0.70 ± 4% -0.1 0.62 ± 2% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.__filemap_add_folio.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare
0.69 ± 4% -0.1 0.63 ± 4% perf-profile.calltrace.cycles-pp.btrfs_punch_hole.btrfs_fallocate.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl
29.90 +0.6 30.49 perf-profile.calltrace.cycles-pp.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep.btrfs_remap_file_range
0.00 +0.9 0.86 ± 6% perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
68.10 +1.2 69.29 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
68.47 +1.2 69.68 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
67.35 +1.2 68.59 perf-profile.calltrace.cycles-pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
21.54 ± 3% +1.5 23.02 perf-profile.calltrace.cycles-pp.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
21.51 ± 3% +1.5 23.00 perf-profile.calltrace.cycles-pp.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
21.46 ± 3% +1.5 22.94 perf-profile.calltrace.cycles-pp.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl
21.53 ± 3% +1.5 23.01 perf-profile.calltrace.cycles-pp.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
0.00 +1.5 1.49 ± 3% perf-profile.calltrace.cycles-pp.__free_one_page.free_pcppages_bulk.free_unref_page_commit.free_unref_page.btrfs_clone
21.15 ± 3% +1.5 22.66 perf-profile.calltrace.cycles-pp.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone
64.61 +1.5 66.16 perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
2.66 ± 2% +1.8 4.51 ± 3% perf-profile.calltrace.cycles-pp.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
0.97 ± 3% +1.8 2.82 ± 5% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc.do_read_cache_folio
2.02 ± 3% +1.9 3.90 ± 4% perf-profile.calltrace.cycles-pp.__alloc_pages.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
1.27 ± 2% +1.9 3.17 ± 4% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare
0.35 ± 70% +2.0 2.31 ± 5% perf-profile.calltrace.cycles-pp.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc
0.00 +2.0 2.00 ± 4% perf-profile.calltrace.cycles-pp.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
1.72 ± 2% +2.1 3.78 perf-profile.calltrace.cycles-pp.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range
0.00 +2.1 2.09 ± 2% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_commit.free_unref_page.btrfs_clone.btrfs_clone_files
0.00 +2.1 2.12 ± 2% perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range
0.00 +2.1 2.14 ± 2% perf-profile.calltrace.cycles-pp.free_unref_page.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range
15.81 -0.7 15.15 perf-profile.children.cycles-pp.filemap_read_folio
15.47 -0.6 14.86 perf-profile.children.cycles-pp.btrfs_read_folio
9.89 -0.5 9.38 perf-profile.children.cycles-pp.memcmp
11.98 -0.4 11.54 perf-profile.children.cycles-pp.btrfs_do_readpage
8.74 ± 2% -0.4 8.30 perf-profile.children.cycles-pp.filemap_add_folio
9.73 ± 3% -0.4 9.35 perf-profile.children.cycles-pp.__clear_extent_bit
5.66 ± 2% -0.4 5.30 perf-profile.children.cycles-pp.__filemap_add_folio
2.45 ± 6% -0.3 2.17 ± 2% perf-profile.children.cycles-pp.btrfs_fallocate
2.57 ± 5% -0.3 2.29 ± 2% perf-profile.children.cycles-pp.ioctl_preallocate
2.53 ± 5% -0.3 2.26 ± 2% perf-profile.children.cycles-pp.vfs_fallocate
4.67 ± 2% -0.3 4.41 ± 3% perf-profile.children.cycles-pp.__set_extent_bit
4.83 ± 2% -0.3 4.58 ± 3% perf-profile.children.cycles-pp.lock_extent
5.06 ± 2% -0.2 4.82 ± 2% perf-profile.children.cycles-pp.alloc_extent_state
4.11 ± 2% -0.2 3.94 ± 2% perf-profile.children.cycles-pp.kmem_cache_alloc
1.37 ± 4% -0.1 1.25 ± 2% perf-profile.children.cycles-pp.__mod_lruvec_page_state
0.66 ± 9% -0.1 0.54 ± 6% perf-profile.children.cycles-pp.btrfs_reserve_extent
0.64 ± 10% -0.1 0.53 ± 6% perf-profile.children.cycles-pp.find_free_extent
0.96 ± 4% -0.1 0.87 ± 6% perf-profile.children.cycles-pp.__wake_up
0.62 ± 4% -0.1 0.54 ± 6% perf-profile.children.cycles-pp.__cond_resched
1.20 ± 4% -0.1 1.12 ± 3% perf-profile.children.cycles-pp.free_extent_state
0.99 ± 3% -0.1 0.92 ± 4% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.89 ± 3% -0.1 0.81 ± 5% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.69 ± 4% -0.1 0.64 ± 4% perf-profile.children.cycles-pp.btrfs_punch_hole
0.12 ± 10% -0.0 0.09 ± 10% perf-profile.children.cycles-pp.__fget_light
0.02 ±141% +0.0 0.06 ± 13% perf-profile.children.cycles-pp.calc_available_free_space
0.29 ± 8% +0.1 0.39 ± 6% perf-profile.children.cycles-pp.__mod_zone_page_state
0.09 ± 17% +0.2 0.25 ± 6% perf-profile.children.cycles-pp.__kmalloc_node
0.09 ± 15% +0.2 0.25 ± 4% perf-profile.children.cycles-pp.kvmalloc_node
0.08 ± 11% +0.2 0.24 ± 4% perf-profile.children.cycles-pp.__kmalloc_large_node
0.24 ± 13% +0.2 0.41 ± 4% perf-profile.children.cycles-pp.__list_add_valid_or_report
0.32 ± 15% +0.6 0.91 ± 4% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
30.03 +0.6 30.64 perf-profile.children.cycles-pp.do_read_cache_folio
1.10 ± 4% +0.6 1.72 ± 2% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.58 ± 6% +0.9 1.50 ± 5% perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
67.36 +1.2 68.60 perf-profile.children.cycles-pp.__x64_sys_ioctl
21.52 ± 3% +1.5 23.00 perf-profile.children.cycles-pp.do_clone_file_range
21.54 ± 3% +1.5 23.02 perf-profile.children.cycles-pp.ioctl_file_clone
21.53 ± 3% +1.5 23.01 perf-profile.children.cycles-pp.vfs_clone_file_range
21.16 ± 3% +1.5 22.66 perf-profile.children.cycles-pp.btrfs_clone_files
0.00 +1.5 1.52 ± 3% perf-profile.children.cycles-pp.__free_one_page
64.61 +1.5 66.16 perf-profile.children.cycles-pp.do_vfs_ioctl
64.16 +1.5 65.71 perf-profile.children.cycles-pp.btrfs_remap_file_range
2.68 ± 3% +1.8 4.52 ± 3% perf-profile.children.cycles-pp.folio_alloc
0.54 ± 6% +2.0 2.51 ± 5% perf-profile.children.cycles-pp.__rmqueue_pcplist
1.03 ± 3% +2.0 3.04 ± 5% perf-profile.children.cycles-pp.rmqueue
2.16 ± 3% +2.0 4.19 ± 4% perf-profile.children.cycles-pp.__alloc_pages
1.32 ± 2% +2.1 3.42 ± 4% perf-profile.children.cycles-pp.get_page_from_freelist
0.00 +2.1 2.10 ± 2% perf-profile.children.cycles-pp.free_pcppages_bulk
2.66 ± 2% +2.1 4.77 perf-profile.children.cycles-pp.btrfs_clone
0.03 ±100% +2.1 2.17 ± 2% perf-profile.children.cycles-pp.free_unref_page
0.40 ± 6% +2.2 2.55 ± 2% perf-profile.children.cycles-pp.free_unref_page_commit
0.00 +2.2 2.21 ± 4% perf-profile.children.cycles-pp.rmqueue_bulk
9.82 -0.5 9.32 perf-profile.self.cycles-pp.memcmp
0.84 ± 5% -0.1 0.76 ± 6% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
1.13 ± 4% -0.1 1.05 ± 2% perf-profile.self.cycles-pp.free_extent_state
0.99 ± 3% -0.1 0.92 ± 4% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.22 ± 8% -0.1 0.16 ± 13% perf-profile.self.cycles-pp.find_free_extent
0.38 ± 4% -0.1 0.32 ± 8% perf-profile.self.cycles-pp.__cond_resched
0.12 ± 10% -0.0 0.08 ± 11% perf-profile.self.cycles-pp.__fget_light
0.06 ± 7% -0.0 0.04 ± 45% perf-profile.self.cycles-pp.__x64_sys_ioctl
0.07 ± 15% +0.0 0.10 ± 9% perf-profile.self.cycles-pp.folio_alloc
0.28 ± 10% +0.1 0.36 ± 7% perf-profile.self.cycles-pp.get_page_from_freelist
0.26 ± 8% +0.1 0.36 ± 4% perf-profile.self.cycles-pp.__mod_zone_page_state
0.22 ± 14% +0.2 0.38 ± 5% perf-profile.self.cycles-pp.__list_add_valid_or_report
0.00 +0.2 0.24 ± 6% perf-profile.self.cycles-pp.free_pcppages_bulk
0.32 ± 15% +0.6 0.91 ± 4% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.00 +0.6 0.62 ± 10% perf-profile.self.cycles-pp.rmqueue_bulk
0.55 ± 6% +0.9 1.46 ± 5% perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
0.00 +1.3 1.32 ± 4% perf-profile.self.cycles-pp.__free_one_page
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki