* [linux-next:master] [mm] 7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: kernel test robot @ 2026-06-18 8:00 UTC
To: Frederick Mayle
Cc: oe-lkp, lkp, Andrew Morton, Jan Kara, Kalesh Singh,
David Hildenbrand, Lorenzo Stoakes, Matthew Wilcox,
Suren Baghdasaryan, linux-fsdevel, linux-mm, oliver.sang
Hello,
kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:
commit: 7b32f64bc512b40b268776c5ac4d354b325b3197 ("mm: limit filemap_fault readahead to VMA boundaries")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
[still regression on linux-next/master ec039126b7fac4e3af35ebccaa7c6f9b6875ba81]
testcase: pts
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P CPU @ 2.4GHz (Granite Rapids) with 256G memory
parameters:
test: svt-av1-2.11.1
option_a: 1
option_b: Bosphorus 4K
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202606181547.617a6967-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260618/202606181547.617a6967-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/rootfs/tbox_group/test/testcase:
gcc-14/performance/x86_64-rhel-9.4/1/Bosphorus 4K/debian-12-x86_64-phoronix/lkp-gnr-2sp3/svt-av1-2.11.1/pts
commit:
0b20c36c11 ("mm/madvise: reject invalid process_madvise() advice for zero-length vectors")
7b32f64bc5 ("mm: limit filemap_fault readahead to VMA boundaries")
0b20c36c118d2122 7b32f64bc512b40b268776c5ac4
---------------- ---------------------------
         %stddev    %change          %stddev
            \          |                \
    169.95 ±  6%     -45.8%      92.15 ± 10%  pts.svt-av1.Preset13.Bosphorus4K.frames_per_second
    220.57 ±  3%    +870.9%       2141        pts.time.major_page_faults
   5898121            -3.7%    5682645        pts.time.maximum_resident_set_size
   1408522            -1.3%    1390537        pts.time.minor_page_faults
    370.43 ±  3%     -12.1%     325.57 ±  6%  pts.time.percent_of_cpu_this_job_got
    645228 ±  2%      +7.6%     694407 ±  3%  pts.time.voluntary_context_switches
    162542 ±  7%     +22.4%     198946 ±  8%  sched_debug.cpu.avg_idle.stddev
    274573 ±  3%      -9.3%     249101 ±  3%  vmstat.io.bi
 6.779e+09 ±  3%     +11.3%  7.545e+09 ±  4%  cpuidle..time
   7639156 ±  3%     +10.5%    8439210 ±  3%  cpuidle..usage
      3.73 ± 39%       -2.2       1.52 ± 52%  perf-profile.calltrace.cycles-pp.pv_native_safe_halt.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state
      0.65 ± 87%       +1.8       2.47 ± 60%  perf-profile.children.cycles-pp.link_path_walk
      0.24 ± 46%     -80.9%       0.05 ±143%  perf-sched.sch_delay.avg.ms.perf_trace_sched_switch.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.65 ± 18%     -33.7%       0.43 ± 17%  perf-sched.wait_and_delay.avg.ms.perf_trace_sched_switch.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.02 ±  7%       +0.0       0.06 ±  8%  mpstat.cpu.all.iowait%
      1.08 ±  4%       -0.2       0.93 ±  6%  mpstat.cpu.all.usr%
     22.00 ±  2%     +14.3%      25.14 ±  2%  mpstat.max_utilization.seconds
     10.00 ±  3%     -40.5%       5.94 ±  2%  mpstat.max_utilization_pct
   5916248            +6.9%    6324576        meminfo.Active
   3659476 ±  2%     +11.4%    4078190 ±  2%  meminfo.Active(anon)
    760816 ±  3%     +16.9%     889730 ±  3%  meminfo.AnonHugePages
   2263661 ±  4%     +18.5%    2682180 ±  3%  meminfo.AnonPages
   6611024           +11.9%    7400589 ±  2%  meminfo.Committed_AS
  11386404            +5.1%   11964152        meminfo.Memused
     21554            +5.6%      22770 ±  2%  meminfo.PageTables
    916081 ±  2%     +11.4%    1020691 ±  2%  proc-vmstat.nr_active_anon
    567129 ±  3%     +18.4%     671694 ±  3%  proc-vmstat.nr_anon_pages
    370.88 ±  3%     +17.2%     434.57 ±  3%  proc-vmstat.nr_anon_transparent_hugepages
     48477            +1.6%      49272        proc-vmstat.nr_kernel_stack
      5399            +5.4%       5691 ±  2%  proc-vmstat.nr_page_table_pages
    916081 ±  2%     +11.4%    1020691 ±  2%  proc-vmstat.nr_zone_active_anon
   4614742            -1.5%    4547741        proc-vmstat.pgalloc_normal
   1667517            -1.0%    1650517        proc-vmstat.pgfault
   2150014            -2.9%    2087812        proc-vmstat.pgfree
    224.57 ±  3%    +862.1%       2160        proc-vmstat.pgmajfault
      1021            -7.2%     948.14        proc-vmstat.thp_fault_alloc
 2.679e+09 ±  3%      -7.5%  2.477e+09 ±  2%  perf-stat.i.branch-instructions
      0.80             +0.1       0.86        perf-stat.i.branch-miss-rate%
  27722463 ±  3%      -8.0%   25498433 ±  2%  perf-stat.i.branch-misses
  47056892 ±  6%     -11.2%   41804370 ±  6%  perf-stat.i.cache-misses
    397.43 ±  3%      -7.4%     368.14 ±  2%  perf-stat.i.cpu-migrations
      0.03 ±  4%       +0.0       0.03 ±  4%  perf-stat.i.dTLB-load-miss-rate%
   3146082 ±  5%     -13.9%    2707716 ±  5%  perf-stat.i.dTLB-load-misses
   1739613 ±  5%     -12.5%    1521663 ±  4%  perf-stat.i.dTLB-store-misses
      9.97 ±  5%    +712.5%      81.01 ±  3%  perf-stat.i.major-faults
     40.27 ±  4%      -8.2%      36.99 ±  3%  perf-stat.i.metric.M/sec
     58347 ±  4%     -11.5%      51627 ±  3%  perf-stat.i.minor-faults
     63.09 ±  6%       -5.8      57.29 ±  8%  perf-stat.i.node-load-miss-rate%
   6474191 ±  5%     -14.1%    5560267 ±  5%  perf-stat.i.node-loads
     58356 ±  4%     -11.4%      51708 ±  3%  perf-stat.i.page-faults
      0.98             -0.0       0.96        perf-stat.overall.branch-miss-rate%
    435.13 ±  3%     +10.0%     478.43 ±  4%  perf-stat.overall.cycles-between-cache-misses
      0.06 ±  2%       -0.0       0.06 ±  4%  perf-stat.overall.dTLB-load-miss-rate%
      0.08 ±  2%       -0.0       0.07 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
 2.646e+09 ±  3%      -8.7%  2.415e+09 ±  2%  perf-stat.ps.branch-instructions
  25808992 ±  3%     -10.2%   23188954 ±  2%  perf-stat.ps.branch-misses
  45057464 ±  5%     -14.4%   38574053 ±  6%  perf-stat.ps.cache-misses
 1.428e+08 ±  3%      -9.7%  1.289e+08 ±  2%  perf-stat.ps.cache-references
    367.11 ±  2%      -7.4%     339.78 ±  2%  perf-stat.ps.cpu-migrations
   2972584 ±  5%     -17.3%    2458144 ±  5%  perf-stat.ps.dTLB-load-misses
 4.801e+09 ±  3%      -9.2%  4.359e+09 ±  2%  perf-stat.ps.dTLB-loads
   1715051 ±  5%     -14.7%    1462728 ±  4%  perf-stat.ps.dTLB-store-misses
 2.193e+09 ±  3%      -9.2%  1.992e+09 ±  2%  perf-stat.ps.dTLB-stores
  2.06e+10 ±  3%      -9.2%  1.871e+10 ±  2%  perf-stat.ps.instructions
      8.54 ±  4%    +768.3%      74.19 ±  2%  perf-stat.ps.major-faults
     59802 ±  3%     -11.0%      53218 ±  3%  perf-stat.ps.minor-faults
   6145046 ±  4%     -17.9%    5046537 ±  4%  perf-stat.ps.node-loads
     59810 ±  3%     -10.9%      53292 ±  3%  perf-stat.ps.page-faults
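For readers checking the numbers, the %change column can be approximately recomputed from the two printed means. The sketch below is not part of the robot's tooling; the helper names pct_change and point_change are made up for illustration. It assumes that rows whose change carries a "%" sign are relative changes against the parent commit, and that rows without one (metrics that are already percentages) are plain differences. Because the printed means are rounded, a recomputed value can differ from the table in the last digit (the major_page_faults row comes out near +870.7 rather than +870.9, presumably for that reason).

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Plain difference, for metrics that are already percentages."""
    return new - base


# Means copied from the comparison table (parent commit first, then 7b32f64bc5).
fps = pct_change(169.95, 92.15)
major_faults = pct_change(220.57, 2141)
node_load_miss = point_change(63.09, 57.29)

print(f"frames_per_second:   {fps:+.1f}%")
print(f"major_page_faults:   {major_faults:+.1f}%")
print(f"node-load-miss-rate: {node_load_miss:+.1f}")
```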
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linux-next:master] [mm] 7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: Jan Kara @ 2026-06-18 9:30 UTC
To: kernel test robot
Cc: Frederick Mayle, oe-lkp, lkp, Andrew Morton, Jan Kara,
Kalesh Singh, David Hildenbrand, Lorenzo Stoakes, Matthew Wilcox,
Suren Baghdasaryan, linux-fsdevel, linux-mm
On Thu 18-06-26 16:00:42, kernel test robot wrote:
> Hello,
>
> kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:
This one looks real, and serious enough. It would be good to figure out what
this benchmark does that makes it benefit so much from readahead across VMA
boundaries...
Honza
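As an illustrative aid to the question above, and not a diagnosis, here is a toy model of how clamping a readahead window to the faulting VMA can raise the major-fault count when one file is mapped through many small adjacent VMAs. Everything in it is an assumption: the function names, the forward-only 32-page window, and the 8-page VMAs are hypothetical, it does not reproduce the real filemap_fault() logic or the commit's code, and nothing in this thread establishes that svt-av1 maps its input this way.

```python
# Toy model only: every name and number here is hypothetical, and the real
# filemap_fault() read-around logic is more involved than a forward window.

def readahead_window(pgoff, ra_pages, vma_start, vma_end, clamp_to_vma):
    """Return the [start, end) file page range read in for a fault at pgoff."""
    start, end = pgoff, pgoff + ra_pages
    if clamp_to_vma:
        # Assumed effect of the commit: never read beyond the faulting VMA.
        start = max(start, vma_start)
        end = min(end, vma_end)
    return start, end


def count_major_faults(file_pages, vmas, ra_pages, clamp_to_vma):
    """Touch every mapped page in order; count faults on uncached pages."""
    cached = set()
    faults = 0
    for vma_start, vma_end in vmas:
        for pgoff in range(vma_start, vma_end):
            if pgoff in cached:
                continue
            faults += 1
            start, end = readahead_window(
                pgoff, ra_pages, vma_start, vma_end, clamp_to_vma
            )
            cached.update(range(start, min(end, file_pages)))
    return faults


# A 256-page file mapped as 32 adjacent 8-page VMAs, with a 32-page window.
vmas = [(i * 8, (i + 1) * 8) for i in range(32)]
unclamped = count_major_faults(256, vmas, 32, clamp_to_vma=False)
clamped = count_major_faults(256, vmas, 32, clamp_to_vma=True)
print(unclamped, clamped)
```

In this toy setup the unclamped run takes 8 faults and the clamped run takes 32, one per VMA. The ratio depends entirely on the made-up sizes and says nothing about the magnitude seen in the report.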
--
Jan Kara <jack@suse.com>
SUSE Labs, CR