* [linux-next:master] [mm]  7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: kernel test robot @ 2026-06-18  8:00 UTC (permalink / raw)
  To: Frederick Mayle
  Cc: oe-lkp, lkp, Andrew Morton, Jan Kara, Kalesh Singh,
	David Hildenbrand, Lorenzo Stoakes, Matthew Wilcox,
	Suren Baghdasaryan, linux-fsdevel, linux-mm, oliver.sang



Hello,

kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:


commit: 7b32f64bc512b40b268776c5ac4d354b325b3197 ("mm: limit filemap_fault readahead to VMA boundaries")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

[still regression on linux-next/master ec039126b7fac4e3af35ebccaa7c6f9b6875ba81]

testcase: pts
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P  CPU @ 2.4GHz (Granite Rapids) with 256G memory
parameters:

	test: svt-av1-2.11.1
	option_a: 1
	option_b: Bosphorus 4K
	cpufreq_governor: performance



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202606181547.617a6967-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260618/202606181547.617a6967-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/1/Bosphorus 4K/debian-12-x86_64-phoronix/lkp-gnr-2sp3/svt-av1-2.11.1/pts

commit: 
  0b20c36c11 ("mm/madvise: reject invalid process_madvise() advice for zero-length vectors")
  7b32f64bc5 ("mm: limit filemap_fault readahead to VMA boundaries")

0b20c36c118d2122 7b32f64bc512b40b268776c5ac4 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    169.95 ±  6%     -45.8%      92.15 ± 10%  pts.svt-av1.Preset13.Bosphorus4K.frames_per_second
    220.57 ±  3%    +870.9%       2141        pts.time.major_page_faults
   5898121            -3.7%    5682645        pts.time.maximum_resident_set_size
   1408522            -1.3%    1390537        pts.time.minor_page_faults
    370.43 ±  3%     -12.1%     325.57 ±  6%  pts.time.percent_of_cpu_this_job_got
    645228 ±  2%      +7.6%     694407 ±  3%  pts.time.voluntary_context_switches
    162542 ±  7%     +22.4%     198946 ±  8%  sched_debug.cpu.avg_idle.stddev
    274573 ±  3%      -9.3%     249101 ±  3%  vmstat.io.bi
 6.779e+09 ±  3%     +11.3%  7.545e+09 ±  4%  cpuidle..time
   7639156 ±  3%     +10.5%    8439210 ±  3%  cpuidle..usage
      3.73 ± 39%      -2.2        1.52 ± 52%  perf-profile.calltrace.cycles-pp.pv_native_safe_halt.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state
      0.65 ± 87%      +1.8        2.47 ± 60%  perf-profile.children.cycles-pp.link_path_walk
      0.24 ± 46%     -80.9%       0.05 ±143%  perf-sched.sch_delay.avg.ms.perf_trace_sched_switch.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.65 ± 18%     -33.7%       0.43 ± 17%  perf-sched.wait_and_delay.avg.ms.perf_trace_sched_switch.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.02 ±  7%      +0.0        0.06 ±  8%  mpstat.cpu.all.iowait%
      1.08 ±  4%      -0.2        0.93 ±  6%  mpstat.cpu.all.usr%
     22.00 ±  2%     +14.3%      25.14 ±  2%  mpstat.max_utilization.seconds
     10.00 ±  3%     -40.5%       5.94 ±  2%  mpstat.max_utilization_pct
   5916248            +6.9%    6324576        meminfo.Active
   3659476 ±  2%     +11.4%    4078190 ±  2%  meminfo.Active(anon)
    760816 ±  3%     +16.9%     889730 ±  3%  meminfo.AnonHugePages
   2263661 ±  4%     +18.5%    2682180 ±  3%  meminfo.AnonPages
   6611024           +11.9%    7400589 ±  2%  meminfo.Committed_AS
  11386404            +5.1%   11964152        meminfo.Memused
     21554            +5.6%      22770 ±  2%  meminfo.PageTables
    916081 ±  2%     +11.4%    1020691 ±  2%  proc-vmstat.nr_active_anon
    567129 ±  3%     +18.4%     671694 ±  3%  proc-vmstat.nr_anon_pages
    370.88 ±  3%     +17.2%     434.57 ±  3%  proc-vmstat.nr_anon_transparent_hugepages
     48477            +1.6%      49272        proc-vmstat.nr_kernel_stack
      5399            +5.4%       5691 ±  2%  proc-vmstat.nr_page_table_pages
    916081 ±  2%     +11.4%    1020691 ±  2%  proc-vmstat.nr_zone_active_anon
   4614742            -1.5%    4547741        proc-vmstat.pgalloc_normal
   1667517            -1.0%    1650517        proc-vmstat.pgfault
   2150014            -2.9%    2087812        proc-vmstat.pgfree
    224.57 ±  3%    +862.1%       2160        proc-vmstat.pgmajfault
      1021            -7.2%     948.14        proc-vmstat.thp_fault_alloc
 2.679e+09 ±  3%      -7.5%  2.477e+09 ±  2%  perf-stat.i.branch-instructions
      0.80            +0.1        0.86        perf-stat.i.branch-miss-rate%
  27722463 ±  3%      -8.0%   25498433 ±  2%  perf-stat.i.branch-misses
  47056892 ±  6%     -11.2%   41804370 ±  6%  perf-stat.i.cache-misses
    397.43 ±  3%      -7.4%     368.14 ±  2%  perf-stat.i.cpu-migrations
      0.03 ±  4%      +0.0        0.03 ±  4%  perf-stat.i.dTLB-load-miss-rate%
   3146082 ±  5%     -13.9%    2707716 ±  5%  perf-stat.i.dTLB-load-misses
   1739613 ±  5%     -12.5%    1521663 ±  4%  perf-stat.i.dTLB-store-misses
      9.97 ±  5%    +712.5%      81.01 ±  3%  perf-stat.i.major-faults
     40.27 ±  4%      -8.2%      36.99 ±  3%  perf-stat.i.metric.M/sec
     58347 ±  4%     -11.5%      51627 ±  3%  perf-stat.i.minor-faults
     63.09 ±  6%      -5.8       57.29 ±  8%  perf-stat.i.node-load-miss-rate%
   6474191 ±  5%     -14.1%    5560267 ±  5%  perf-stat.i.node-loads
     58356 ±  4%     -11.4%      51708 ±  3%  perf-stat.i.page-faults
      0.98            -0.0        0.96        perf-stat.overall.branch-miss-rate%
    435.13 ±  3%     +10.0%     478.43 ±  4%  perf-stat.overall.cycles-between-cache-misses
      0.06 ±  2%      -0.0        0.06 ±  4%  perf-stat.overall.dTLB-load-miss-rate%
      0.08 ±  2%      -0.0        0.07 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
 2.646e+09 ±  3%      -8.7%  2.415e+09 ±  2%  perf-stat.ps.branch-instructions
  25808992 ±  3%     -10.2%   23188954 ±  2%  perf-stat.ps.branch-misses
  45057464 ±  5%     -14.4%   38574053 ±  6%  perf-stat.ps.cache-misses
 1.428e+08 ±  3%      -9.7%  1.289e+08 ±  2%  perf-stat.ps.cache-references
    367.11 ±  2%      -7.4%     339.78 ±  2%  perf-stat.ps.cpu-migrations
   2972584 ±  5%     -17.3%    2458144 ±  5%  perf-stat.ps.dTLB-load-misses
 4.801e+09 ±  3%      -9.2%  4.359e+09 ±  2%  perf-stat.ps.dTLB-loads
   1715051 ±  5%     -14.7%    1462728 ±  4%  perf-stat.ps.dTLB-store-misses
 2.193e+09 ±  3%      -9.2%  1.992e+09 ±  2%  perf-stat.ps.dTLB-stores
  2.06e+10 ±  3%      -9.2%  1.871e+10 ±  2%  perf-stat.ps.instructions
      8.54 ±  4%    +768.3%      74.19 ±  2%  perf-stat.ps.major-faults
     59802 ±  3%     -11.0%      53218 ±  3%  perf-stat.ps.minor-faults
   6145046 ±  4%     -17.9%    5046537 ±  4%  perf-stat.ps.node-loads
     59810 ±  3%     -10.9%      53292 ±  3%  perf-stat.ps.page-faults
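
For readers checking the table by hand, the %change column can be reproduced
from the two mean columns. The sketch below is only an illustration: the
helper name pct_change is made up here, and the inputs are the rounded means
printed above, so recomputed values can differ from the table by a few tenths
of a percent. Rows whose change is printed without a percent sign (for
example mpstat.cpu.all.usr%) appear to be absolute differences rather than
relative ones.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of the patched kernel versus the base kernel, in percent."""
    return (patched - base) / base * 100.0


# (base mean, patched mean) copied from the table above; these are the
# rounded values as printed, not the raw samples behind them.
rows = {
    "pts.svt-av1.Preset13.Bosphorus4K.frames_per_second": (169.95, 92.15),
    "pts.time.major_page_faults": (220.57, 2141),
    "proc-vmstat.pgmajfault": (224.57, 2160),
}

for name, (base, patched) in rows.items():
    print(f"{pct_change(base, patched):+8.1f}%  {name}")
```

The frames-per-second row recomputes to the reported -45.8%; the two
major-fault rows land within rounding distance of the reported values.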




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [linux-next:master] [mm]  7b32f64bc5: pts.svt-av1.Preset13.Bosphorus4K.frames_per_second 45.8% regression
From: Jan Kara @ 2026-06-18  9:30 UTC (permalink / raw)
  To: kernel test robot
  Cc: Frederick Mayle, oe-lkp, lkp, Andrew Morton, Jan Kara,
	Kalesh Singh, David Hildenbrand, Lorenzo Stoakes, Matthew Wilcox,
	Suren Baghdasaryan, linux-fsdevel, linux-mm

On Thu 18-06-26 16:00:42, kernel test robot wrote:
> Hello,
> 
> kernel test robot noticed a 45.8% regression of pts.svt-av1.Preset13.Bosphorus4K.frames_per_second on:

This one looks serious and real. It would be good to figure out what this
benchmark does that makes it benefit so much from readahead across VMA
boundaries...
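
As a way to reason about the question, here is a toy model of how clamping a
read-around window to the faulting VMA could multiply major faults when a
file is mapped as many small VMAs and touched sequentially. Everything in it
is an assumption for illustration: the function names, the window placement
(centred on the faulting page) and the sizes are made up, this is not the
kernel's filemap_fault code, and nothing here establishes that svt-av1
actually maps its input this way.

```python
def readahead_window(fault_pgoff, ra_pages, vma_start, vma_end, clamp):
    """Return the [start, end) file page range read for one fault.

    Toy model: the window is centred on the faulting page and, when
    clamp is True, trimmed to the faulting VMA's [vma_start, vma_end).
    """
    start = max(0, fault_pgoff - ra_pages // 2)
    end = start + ra_pages
    if clamp:
        start = max(start, vma_start)
        end = min(end, vma_end)
    return start, end


def count_major_faults(file_pages, vma_pages, ra_pages, clamp):
    """Touch every page once, in order; count faults on uncached pages."""
    cached = set()
    faults = 0
    for pgoff in range(file_pages):
        if pgoff in cached:
            continue  # already brought in by an earlier window
        faults += 1
        # Hypothetical layout: the file is mapped as back-to-back VMAs
        # of vma_pages pages each.
        vma_start = (pgoff // vma_pages) * vma_pages
        start, end = readahead_window(
            pgoff, ra_pages, vma_start, vma_start + vma_pages, clamp
        )
        cached.update(range(start, min(end, file_pages)))
    return faults


if __name__ == "__main__":
    for clamp in (False, True):
        n = count_major_faults(file_pages=1024, vma_pages=4, ra_pages=32, clamp=clamp)
        print(f"clamp={clamp}: {n} major faults")
```

In this model the clamped case takes one fault per VMA, while the unclamped
case takes one fault per half window, so the gap grows as the VMAs get
smaller relative to the readahead window.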

								Honza


-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
