* [linux-next:master] [mm] 7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: kernel test robot @ 2026-06-18 8:00 UTC (permalink / raw)
To: Frederick Mayle
Cc: oe-lkp, lkp, Andrew Morton, Jan Kara, Kalesh Singh,
David Hildenbrand, Lorenzo Stoakes, Matthew Wilcox,
Suren Baghdasaryan, linux-fsdevel, linux-mm, oliver.sang
Hello,
kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:
commit: 7b32f64bc512b40b268776c5ac4d354b325b3197 ("mm: limit filemap_fault readahead to VMA boundaries")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
[still regression on linux-next/master ec039126b7fac4e3af35ebccaa7c6f9b6875ba81]
testcase: pts
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P CPU @ 2.4GHz (Granite Rapids) with 256G memory
parameters:
test: svt-av1-2.11.1
option_a: 1
option_b: Bosphorus 4K
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202606181547.617a6967-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260618/202606181547.617a6967-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/rootfs/tbox_group/test/testcase:
gcc-14/performance/x86_64-rhel-9.4/1/Bosphorus 4K/debian-12-x86_64-phoronix/lkp-gnr-2sp3/svt-av1-2.11.1/pts
commit:
0b20c36c11 ("mm/madvise: reject invalid process_madvise() advice for zero-length vectors")
7b32f64bc5 ("mm: limit filemap_fault readahead to VMA boundaries")
0b20c36c118d2122 7b32f64bc512b40b268776c5ac4
---------------- ---------------------------
         %stddev    %change          %stddev
             \          |                \
    169.95 ±  6%     -45.8%      92.15 ± 10%  pts.svt-av1.Preset13.Bosphorus4K.frames_per_second
    220.57 ±  3%    +870.9%       2141        pts.time.major_page_faults
   5898121            -3.7%    5682645        pts.time.maximum_resident_set_size
   1408522            -1.3%    1390537        pts.time.minor_page_faults
    370.43 ±  3%     -12.1%     325.57 ±  6%  pts.time.percent_of_cpu_this_job_got
    645228 ±  2%      +7.6%     694407 ±  3%  pts.time.voluntary_context_switches
    162542 ±  7%     +22.4%     198946 ±  8%  sched_debug.cpu.avg_idle.stddev
    274573 ±  3%      -9.3%     249101 ±  3%  vmstat.io.bi
 6.779e+09 ±  3%     +11.3%  7.545e+09 ±  4%  cpuidle..time
   7639156 ±  3%     +10.5%    8439210 ±  3%  cpuidle..usage
      3.73 ± 39%       -2.2       1.52 ± 52%  perf-profile.calltrace.cycles-pp.pv_native_safe_halt.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state
      0.65 ± 87%       +1.8       2.47 ± 60%  perf-profile.children.cycles-pp.link_path_walk
      0.24 ± 46%     -80.9%       0.05 ±143%  perf-sched.sch_delay.avg.ms.perf_trace_sched_switch.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.65 ± 18%     -33.7%       0.43 ± 17%  perf-sched.wait_and_delay.avg.ms.perf_trace_sched_switch.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.02 ±  7%       +0.0       0.06 ±  8%  mpstat.cpu.all.iowait%
      1.08 ±  4%       -0.2       0.93 ±  6%  mpstat.cpu.all.usr%
     22.00 ±  2%     +14.3%      25.14 ±  2%  mpstat.max_utilization.seconds
     10.00 ±  3%     -40.5%       5.94 ±  2%  mpstat.max_utilization_pct
   5916248            +6.9%    6324576        meminfo.Active
   3659476 ±  2%     +11.4%    4078190 ±  2%  meminfo.Active(anon)
    760816 ±  3%     +16.9%     889730 ±  3%  meminfo.AnonHugePages
   2263661 ±  4%     +18.5%    2682180 ±  3%  meminfo.AnonPages
   6611024           +11.9%    7400589 ±  2%  meminfo.Committed_AS
  11386404            +5.1%   11964152        meminfo.Memused
     21554            +5.6%      22770 ±  2%  meminfo.PageTables
    916081 ±  2%     +11.4%    1020691 ±  2%  proc-vmstat.nr_active_anon
    567129 ±  3%     +18.4%     671694 ±  3%  proc-vmstat.nr_anon_pages
    370.88 ±  3%     +17.2%     434.57 ±  3%  proc-vmstat.nr_anon_transparent_hugepages
     48477            +1.6%      49272        proc-vmstat.nr_kernel_stack
      5399            +5.4%       5691 ±  2%  proc-vmstat.nr_page_table_pages
    916081 ±  2%     +11.4%    1020691 ±  2%  proc-vmstat.nr_zone_active_anon
   4614742            -1.5%    4547741        proc-vmstat.pgalloc_normal
   1667517            -1.0%    1650517        proc-vmstat.pgfault
   2150014            -2.9%    2087812        proc-vmstat.pgfree
    224.57 ±  3%    +862.1%       2160        proc-vmstat.pgmajfault
      1021            -7.2%     948.14        proc-vmstat.thp_fault_alloc
 2.679e+09 ±  3%      -7.5%  2.477e+09 ±  2%  perf-stat.i.branch-instructions
      0.80             +0.1       0.86        perf-stat.i.branch-miss-rate%
  27722463 ±  3%      -8.0%   25498433 ±  2%  perf-stat.i.branch-misses
  47056892 ±  6%     -11.2%   41804370 ±  6%  perf-stat.i.cache-misses
    397.43 ±  3%      -7.4%     368.14 ±  2%  perf-stat.i.cpu-migrations
      0.03 ±  4%       +0.0       0.03 ±  4%  perf-stat.i.dTLB-load-miss-rate%
   3146082 ±  5%     -13.9%    2707716 ±  5%  perf-stat.i.dTLB-load-misses
   1739613 ±  5%     -12.5%    1521663 ±  4%  perf-stat.i.dTLB-store-misses
      9.97 ±  5%    +712.5%      81.01 ±  3%  perf-stat.i.major-faults
     40.27 ±  4%      -8.2%      36.99 ±  3%  perf-stat.i.metric.M/sec
     58347 ±  4%     -11.5%      51627 ±  3%  perf-stat.i.minor-faults
     63.09 ±  6%       -5.8      57.29 ±  8%  perf-stat.i.node-load-miss-rate%
   6474191 ±  5%     -14.1%    5560267 ±  5%  perf-stat.i.node-loads
     58356 ±  4%     -11.4%      51708 ±  3%  perf-stat.i.page-faults
      0.98             -0.0       0.96        perf-stat.overall.branch-miss-rate%
    435.13 ±  3%     +10.0%     478.43 ±  4%  perf-stat.overall.cycles-between-cache-misses
      0.06 ±  2%       -0.0       0.06 ±  4%  perf-stat.overall.dTLB-load-miss-rate%
      0.08 ±  2%       -0.0       0.07 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
 2.646e+09 ±  3%      -8.7%  2.415e+09 ±  2%  perf-stat.ps.branch-instructions
  25808992 ±  3%     -10.2%   23188954 ±  2%  perf-stat.ps.branch-misses
  45057464 ±  5%     -14.4%   38574053 ±  6%  perf-stat.ps.cache-misses
 1.428e+08 ±  3%      -9.7%  1.289e+08 ±  2%  perf-stat.ps.cache-references
    367.11 ±  2%      -7.4%     339.78 ±  2%  perf-stat.ps.cpu-migrations
   2972584 ±  5%     -17.3%    2458144 ±  5%  perf-stat.ps.dTLB-load-misses
 4.801e+09 ±  3%      -9.2%  4.359e+09 ±  2%  perf-stat.ps.dTLB-loads
   1715051 ±  5%     -14.7%    1462728 ±  4%  perf-stat.ps.dTLB-store-misses
 2.193e+09 ±  3%      -9.2%  1.992e+09 ±  2%  perf-stat.ps.dTLB-stores
  2.06e+10 ±  3%      -9.2%  1.871e+10 ±  2%  perf-stat.ps.instructions
      8.54 ±  4%    +768.3%      74.19 ±  2%  perf-stat.ps.major-faults
     59802 ±  3%     -11.0%      53218 ±  3%  perf-stat.ps.minor-faults
   6145046 ±  4%     -17.9%    5046537 ±  4%  perf-stat.ps.node-loads
     59810 ±  3%     -10.9%      53292 ±  3%  perf-stat.ps.page-faults
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linux-next:master] [mm] 7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: Jan Kara @ 2026-06-18 9:30 UTC
To: kernel test robot
Cc: Frederick Mayle, oe-lkp, lkp, Andrew Morton, Jan Kara, Kalesh Singh,
David Hildenbrand, Lorenzo Stoakes, Matthew Wilcox,
Suren Baghdasaryan, linux-fsdevel, linux-mm

On Thu 18-06-26 16:00:42, kernel test robot wrote:
> Hello,
>
> kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:

This one looks serious enough and real. It would be good to figure out what
happens in this benchmark that it benefits from the readahead across VMA
boundaries so much...

Honza

> commit: 7b32f64bc512b40b268776c5ac4d354b325b3197 ("mm: limit filemap_fault readahead to VMA boundaries")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> [still regression on linux-next/master ec039126b7fac4e3af35ebccaa7c6f9b6875ba81]

--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [linux-next:master] [mm] 7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: Lorenzo Stoakes @ 2026-06-18 14:30 UTC
To: Jan Kara
Cc: kernel test robot, Frederick Mayle, oe-lkp, lkp, Andrew Morton,
Kalesh Singh, David Hildenbrand, Matthew Wilcox, Suren Baghdasaryan,
linux-fsdevel, linux-mm

On Thu, Jun 18, 2026 at 11:30:47AM +0200, Jan Kara wrote:
> On Thu 18-06-26 16:00:42, kernel test robot wrote:
> > Hello,
> >
> > kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:
>
> This one looks serious enough and real. It would be good to figure out what
> happens in this benchmark that it benefits from the readahead across VMA
> boundaries so much...

I think a revert first no? This seems pretty huge for something that isn't key
to the kernel, then a new attempt can be tried with this issue addressed
perhaps?

I seem to recall objecting to this in the past, had a look around and saw [0]
which has some discussion I think specifically on the VMA boundaries thing.

[0]:https://lore.kernel.org/linux-mm/CAC_TJvfG8GcwG_2w1o6GOTZS8tfEx2h9A91qsenYfYsX8Te=Bg@mail.gmail.com/

(Sorry to be naggy buuut :) - I see there wasn't maintainer [Matthew] signoff on
this, we should really be in the habit of _requiring_ that for merge).

Thanks, Lorenzo
* Re: [linux-next:master] [mm] 7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: Suren Baghdasaryan @ 2026-06-18 16:03 UTC
To: Lorenzo Stoakes
Cc: Jan Kara, kernel test robot, Frederick Mayle, oe-lkp, lkp,
Andrew Morton, Kalesh Singh, David Hildenbrand, Matthew Wilcox,
linux-fsdevel, linux-mm

On Thu, Jun 18, 2026 at 2:30 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
>
> On Thu, Jun 18, 2026 at 11:30:47AM +0200, Jan Kara wrote:
> > On Thu 18-06-26 16:00:42, kernel test robot wrote:
> > > Hello,
> > >
> > > kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:
> >
> > This one looks serious enough and real. It would be good to figure out what
> > happens in this benchmark that it benefits from the readahead across VMA
> > boundaries so much...
>
> I think a revert first no? This seems pretty huge for something that isn't key
> to the kernel, then a new attempt can be tried with this issue addressed
> perhaps?

A quick search yields: "The
pts.svt-av1.Preset13.Bosphorus4K.frames_per_second is a benchmarking
metric from the Phoronix Test Suite that measures how many frames per
second a CPU can encode using the open-source SVT-AV1 video encoder."

If this is a video encoding benchmark I would expect it to explicitly
prefetch the data from the disk before measuring the encoding speed.
If limiting readahead caused this regression, I suspect the benchmark
doesn't explicitly prefetch the data...

>
> I seem to recall objecting to this in the past, had a look around and saw [0]
> which has some discussion I think specifically on the VMA boundaries thing.
>
> [0]:https://lore.kernel.org/linux-mm/CAC_TJvfG8GcwG_2w1o6GOTZS8tfEx2h9A91qsenYfYsX8Te=Bg@mail.gmail.com/
>
> (Sorry to be naggy buuut :) - I see there wasn't maintainer [Matthew] signoff on
> this, we should really be in the habit of _requiring_ that for merge).
>
> Thanks, Lorenzo
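For readers wondering what "explicitly prefetch" could look like from userspace, here is a minimal sketch in C. It assumes fixed-size frames stored back to back in the input file; the helper name prefetch_frame and its parameters are hypothetical and do not come from SVT-AV1 or from this thread. It only uses posix_fadvise() with POSIX_FADV_WILLNEED, which asks the kernel to start reading the given range into the page cache.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not taken from SVT-AV1: hint that frame number
 * `next` (each frame being `frame_size` bytes) will be needed soon.
 * Returns 0 on success or an errno value, as posix_fadvise() does.
 */
int prefetch_frame(int fd, off_t frame_size, off_t next)
{
    return posix_fadvise(fd, next * frame_size, frame_size,
                         POSIX_FADV_WILLNEED);
}
```

An encoder loop could call prefetch_frame(fd, frame_size, i + 1) before it starts working on frame i. The hint is advisory, so whether it would recover the lost throughput in this benchmark is untested here.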
* Re: [linux-next:master] [mm] 7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: Pedro Falcato @ 2026-06-18 16:32 UTC
To: Suren Baghdasaryan
Cc: Lorenzo Stoakes, Jan Kara, kernel test robot, Frederick Mayle,
oe-lkp, lkp, Andrew Morton, Kalesh Singh, David Hildenbrand,
Matthew Wilcox, linux-fsdevel, linux-mm

On Thu, Jun 18, 2026 at 04:03:43PM +0000, Suren Baghdasaryan wrote:
> On Thu, Jun 18, 2026 at 2:30 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
> >
> > On Thu, Jun 18, 2026 at 11:30:47AM +0200, Jan Kara wrote:
> > > On Thu 18-06-26 16:00:42, kernel test robot wrote:
> > > > Hello,
> > > >
> > > > kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:
> > >
> > > This one looks serious enough and real. It would be good to figure out what
> > > happens in this benchmark that it benefits from the readahead across VMA
> > > boundaries so much...
> >
> > I think a revert first no? This seems pretty huge for something that isn't key
> > to the kernel, then a new attempt can be tried with this issue addressed
> > perhaps?
>
> A quick search yields: "The
> pts.svt-av1.Preset13.Bosphorus4K.frames_per_second is a benchmarking
> metric from the Phoronix Test Suite that measures how many frames per
> second a CPU can encode using the open-source SVT-AV1 video encoder."
>
> If this is a video encoding benchmark I would expect it to explicitly
> prefetch the data from the disk before measuring the encoding speed.
> If limiting readahead caused this regression, I suspect the benchmark
> doesn't explicitly prefetch the data...

Well, commonly video data doesn't actually fit in memory :)

A quick look at the code (I think it's https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Source/App/app_process_cmd.c#L821)
suggests it is progressively mapping the file data for a given frame
(or frames?). So the old behavior would result in page faults for a given
frame starting readahead for the next few frames. This looks reasonable.

FWIW I suspected this was a really weird case regarding mprotect or
something, and I'm happy it isn't; but at least I had a suggestion for that -
for this, maybe dropping the change (for now?) is the best course of action.

--
Pedro
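The access pattern described above can be approximated with a small C sketch that maps each frame in its own VMA, touches every page, and reports how many major faults the walk caused. This is an illustration only: walk_frames and major_faults are hypothetical names, the frame size is assumed to be a multiple of the page size, and the code is not derived from the SVT-AV1 source. getrusage() and its ru_majflt field are the standard way to read a process's major fault count.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>

/* Major faults taken by this process so far, or -1 on error. */
static long major_faults(void)
{
    struct rusage ru;
    return getrusage(RUSAGE_SELF, &ru) == 0 ? ru.ru_majflt : -1;
}

/*
 * Map every frame in a separate mapping, read one byte per page, unmap.
 * Assumption of this sketch: frame_size is a multiple of the page size.
 * Stores the sum of the bytes read in *sum and returns the number of
 * major faults incurred by the walk, or -1 on error.
 */
long walk_frames(int fd, size_t frame_size, size_t nframes, uint64_t *sum)
{
    long page = sysconf(_SC_PAGESIZE);
    long before = major_faults();

    if (before < 0 || page <= 0 || frame_size % (size_t)page != 0)
        return -1;
    *sum = 0;
    for (size_t f = 0; f < nframes; f++) {
        unsigned char *p = mmap(NULL, frame_size, PROT_READ, MAP_PRIVATE,
                                fd, (off_t)(f * frame_size));
        if (p == MAP_FAILED)
            return -1;
        for (size_t i = 0; i < frame_size; i += (size_t)page)
            *sum += p[i];          /* one read per page faults it in */
        munmap(p, frame_size);
    }
    long after = major_faults();
    return after < 0 ? -1 : after - before;
}
```

Running such a walk over a cold file on kernels with and without the commit would be one way to compare fault counts against the pgmajfault numbers in the report; the file has to be evicted from the page cache first for the comparison to mean anything.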
* Re: [linux-next:master] [mm] 7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: Suren Baghdasaryan @ 2026-06-18 16:51 UTC
To: Pedro Falcato
Cc: Lorenzo Stoakes, Jan Kara, kernel test robot, Frederick Mayle,
oe-lkp, lkp, Andrew Morton, Kalesh Singh, David Hildenbrand,
Matthew Wilcox, linux-fsdevel, linux-mm

On Thu, Jun 18, 2026 at 9:32 AM Pedro Falcato <pfalcato@suse.de> wrote:
>
> On Thu, Jun 18, 2026 at 04:03:43PM +0000, Suren Baghdasaryan wrote:
> > On Thu, Jun 18, 2026 at 2:30 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
> > >
> > > On Thu, Jun 18, 2026 at 11:30:47AM +0200, Jan Kara wrote:
> > > > On Thu 18-06-26 16:00:42, kernel test robot wrote:
> > > > > Hello,
> > > > >
> > > > > kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:
> > > >
> > > > This one looks serious enough and real. It would be good to figure out what
> > > > happens in this benchmark that it benefits from the readahead across VMA
> > > > boundaries so much...
> > >
> > > I think a revert first no? This seems pretty huge for something that isn't key
> > > to the kernel, then a new attempt can be tried with this issue addressed
> > > perhaps?
> >
> > A quick search yields: "The
> > pts.svt-av1.Preset13.Bosphorus4K.frames_per_second is a benchmarking
> > metric from the Phoronix Test Suite that measures how many frames per
> > second a CPU can encode using the open-source SVT-AV1 video encoder."
> >
> > If this is a video encoding benchmark I would expect it to explicitly
> > prefetch the data from the disk before measuring the encoding speed.
> > If limiting readahead caused this regression, I suspect the benchmark
> > doesn't explicitly prefetch the data...
>
> Well, commonly video data doesn't actually fit in memory :)
>
> A quick look at the code (I think it's https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Source/App/app_process_cmd.c#L821)
> suggests it is progressively mapping the file data for a given frame
> (or frames?). So the old behavior would result in page faults for a given
> frame starting readahead for the next few frames. This looks reasonable.

Yeah, looks like it's mapping one frame at a time and even that is
done with 3 mmap calls - it maps luma_read_size and then 2
chroma_read_size chunks which are consecutive and could have been
mapped with one syscall. Seems very inefficient but I guess it's
legit.

>
> FWIW I suspected this was a really weird case regarding mprotect or
> something, and I'm happy it isn't; but at least I had a suggestion for that -
> for this, maybe dropping the change (for now?) is the best course of action.
>
> --
> Pedro
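To make the "3 mmap calls versus one" observation concrete, the following C sketch reads one planar frame both ways. It is a simplified illustration under stated assumptions, not the SVT-AV1 code: struct frame_layout, sum_range, read_frame_three_maps and read_frame_one_map are hypothetical names, the luma plane is assumed to be followed directly by two equally sized chroma planes, and each helper simply sums the bytes so that every page gets touched.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout of one frame: Y plane, then Cb and Cr planes. */
struct frame_layout {
    size_t luma_size;   /* bytes in the luma plane */
    size_t chroma_size; /* bytes in each chroma plane */
};

/* Map [off, off + len) read-only and add its bytes to *sum. */
static int sum_range(int fd, off_t off, size_t len, uint64_t *sum)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t aligned = off - (off % page);      /* mmap needs page alignment */
    size_t slack = (size_t)(off - aligned);
    unsigned char *base = mmap(NULL, len + slack, PROT_READ, MAP_PRIVATE,
                               fd, aligned);
    if (base == MAP_FAILED)
        return -1;
    for (size_t i = 0; i < len; i++)
        *sum += base[slack + i];
    return munmap(base, len + slack);
}

/* Pattern described in the thread: one mapping per plane. */
int read_frame_three_maps(int fd, off_t frame_off,
                          const struct frame_layout *l, uint64_t *sum)
{
    off_t cb = frame_off + (off_t)l->luma_size;
    off_t cr = cb + (off_t)l->chroma_size;

    *sum = 0;
    if (sum_range(fd, frame_off, l->luma_size, sum) != 0 ||
        sum_range(fd, cb, l->chroma_size, sum) != 0)
        return -1;
    return sum_range(fd, cr, l->chroma_size, sum);
}

/* Alternative: the planes are consecutive, so one mapping covers them. */
int read_frame_one_map(int fd, off_t frame_off,
                       const struct frame_layout *l, uint64_t *sum)
{
    *sum = 0;
    return sum_range(fd, frame_off, l->luma_size + 2 * l->chroma_size, sum);
}
```

Both functions return the same checksum for the same frame; the second needs one mmap/munmap pair instead of three. Whether a single per-frame mapping would change the fault counts seen with the commit applied has not been measured here.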