* [tip:sched/core] [sched/fair]  e837456fdc:  pts.quadray.1.1080p.fps 23.2% improvement
From: kernel test robot @ 2025-11-21  6:59 UTC (permalink / raw)
  To: Mel Gorman
  Cc: oe-lkp, lkp, linux-kernel, x86, Peter Zijlstra, aubrey.li,
	yu.c.chen, oliver.sang



Hello,

kernel test robot noticed a 23.2% improvement of pts.quadray.1.1080p.fps on:


commit: e837456fdca81899a3c8e47b3fd39e30eae6e291 ("sched/fair: Reimplement NEXT_BUDDY to align with EEVDF goals")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core


testcase: pts
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory
parameters:

	need_x: true
	test: quadray-1.0.0
	option_a: 5
	option_b: 1080p
	cpufreq_governor: performance



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251121/202511211403.39a71f1e-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/option_b/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/true/5/1080p/debian-12-x86_64-phoronix/lkp-csl-2sp7/quadray-1.0.0/pts

commit: 
  aceccac58a ("sched/fair: Enable scheduler feature NEXT_BUDDY")
  e837456fdc ("sched/fair: Reimplement NEXT_BUDDY to align with EEVDF goals")

aceccac58ad76305 e837456fdca81899a3c8e47b3fd 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  15179434 ± 18%     +51.8%   23046485 ± 15%  meminfo.DirectMap2M
      0.37            +0.0        0.41        mpstat.cpu.all.sys%
   1055135 ± 31%     -53.4%     492047 ± 91%  numa-meminfo.node0.Shmem
    263798 ± 31%     -53.4%     122995 ± 91%  numa-vmstat.node0.nr_shmem
     21104            +9.2%      23039        vmstat.system.cs
    371194            +1.8%     378021        proc-vmstat.nr_shmem
      4628            +2.9%       4762        proc-vmstat.numa_huge_pte_updates
     18161 ±  6%     +22.2%      22195 ± 11%  sched_debug.cfs_rq:/system.slice.avg_vruntime.stddev
     18051 ±  6%     +22.4%      22090 ± 11%  sched_debug.cfs_rq:/system.slice.zero_vruntime.stddev
     16231           +11.2%      18053        sched_debug.cpu.nr_switches.avg
      6.26 ± 10%     -19.7%       5.03 ± 10%  sched_debug.cpu.nr_uninterruptible.stddev
     39.69 ±  2%     +23.2%      48.89        pts.quadray.1.1080p.fps
      3245            +2.4%       3323        pts.time.percent_of_cpu_this_job_got
     39.97 ±  2%     +14.1%      45.62        pts.time.system_time
      4456            +2.2%       4555        pts.time.user_time
   1290701            +9.9%    1419060        pts.time.voluntary_context_switches
  6.13e+09            +2.8%  6.299e+09        perf-stat.i.branch-instructions
     21517            +9.0%      23449        perf-stat.i.context-switches
 6.949e+10            +2.3%  7.108e+10        perf-stat.i.cpu-cycles
    203.07            +7.8%     218.95        perf-stat.i.cpu-migrations
      8980 ±  2%      -7.8%       8279        perf-stat.i.cycles-between-cache-misses
 2.515e+10            +2.9%  2.589e+10        perf-stat.i.dTLB-loads
  8.21e+09            +2.7%  8.429e+09        perf-stat.i.dTLB-stores
 6.904e+10            +2.9%  7.102e+10        perf-stat.i.instructions
      0.72            +2.3%       0.74        perf-stat.i.metric.GHz
    514.91 ±  2%      -9.4%     466.42 ±  5%  perf-stat.i.metric.K/sec
    412.78            +2.9%     424.58        perf-stat.i.metric.M/sec
    372305 ±  2%     +20.8%     449603 ±  5%  perf-stat.i.node-store-misses
      1.81            -0.0        1.77        perf-stat.overall.branch-miss-rate%
     57.23 ±  2%      +4.3       61.51 ±  3%  perf-stat.overall.node-store-miss-rate%
 6.085e+09            +2.8%  6.254e+09        perf-stat.ps.branch-instructions
     21348            +9.0%      23273        perf-stat.ps.context-switches
   6.9e+10            +2.3%  7.059e+10        perf-stat.ps.cpu-cycles
    201.61            +7.8%     217.38        perf-stat.ps.cpu-migrations
 2.498e+10            +2.9%  2.571e+10        perf-stat.ps.dTLB-loads
 8.154e+09            +2.7%  8.373e+09        perf-stat.ps.dTLB-stores
 6.855e+10            +2.9%  7.053e+10        perf-stat.ps.instructions
    369608 ±  2%     +20.8%     446351 ±  5%  perf-stat.ps.node-store-misses
 9.519e+12            +3.0%  9.803e+12        perf-stat.total.instructions
      0.00 ±223%   +1490.9%       0.03 ±109%  perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.01 ±  9%     -94.7%       0.00 ±111%  perf-sched.sch_delay.avg.ms.__cond_resched.ww_mutex_lock.drm_gem_vunmap.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
      0.06 ± 21%     -86.5%       0.01 ± 17%  perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep.do_syscall_64
      0.12 ± 31%     -89.2%       0.01 ± 68%  perf-sched.sch_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.06 ± 67%     -67.6%       0.02 ± 40%  perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.03 ±  2%     -36.3%       0.02 ±  3%  perf-sched.sch_delay.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
      0.21 ± 12%     -87.8%       0.03 ±  7%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.05 ± 14%     -78.5%       0.01 ± 14%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.do_epoll_pwait.part
      0.03 ± 43%     -70.0%       0.01 ±  9%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
      0.22 ± 56%     -94.6%       0.01 ± 41%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
      0.02 ± 27%     -54.8%       0.01 ±  6%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.00 ±223%   +3636.4%       0.07 ±169%  perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      1.11 ± 92%     -98.3%       0.02 ± 11%  perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      2.06 ±  6%     -98.4%       0.03 ± 15%  perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep.do_syscall_64
      3.22 ±  4%     -63.6%       1.17 ± 98%  perf-sched.sch_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.67 ±129%    +808.5%       6.12 ± 52%  perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.00 ±223%   +4066.7%       0.04 ±153%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
      0.02 ± 32%    +715.6%       0.12 ±149%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      2.22 ± 14%     -43.5%       1.25 ± 30%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.do_epoll_pwait.part
      1.84 ± 48%     -91.5%       0.16 ±169%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
      1.68 ± 55%     -94.8%       0.09 ±164%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
      0.35 ±130%     -93.5%       0.02 ± 25%  perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      1.34 ± 35%     -97.2%       0.04 ± 48%  perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.03 ±  3%     -37.6%       0.02 ±  3%  perf-sched.total_sch_delay.average.ms
      8.33 ±  4%     -18.9%       6.76 ±  3%  perf-sched.total_wait_and_delay.average.ms
    162856           +20.3%     195924        perf-sched.total_wait_and_delay.count.ms
      8.30 ±  4%     -18.8%       6.74 ±  3%  perf-sched.total_wait_time.average.ms
    213.84 ± 51%     -87.7%      26.25 ± 77%  perf-sched.wait_and_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     15.84 ±  2%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.ww_mutex_lock.drm_gem_vunmap.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
    800.53           -15.3%     677.82 ± 13%  perf-sched.wait_and_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      2.37 ±  2%     -21.9%       1.85        perf-sched.wait_and_delay.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
      1.38 ±  9%     -51.8%       0.67 ± 42%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     38.24 ±  7%     -12.2%      33.58 ±  7%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
     61.60 ±  3%     -24.2%      46.70 ±  2%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
     13.33 ± 24%    +217.5%      42.33 ±  9%  perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.33 ±141%   +5300.0%      18.00 ± 12%  perf-sched.wait_and_delay.count.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
    196.50 ±  2%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.ww_mutex_lock.drm_gem_vunmap.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
    152066 ±  2%     +19.5%     181763        perf-sched.wait_and_delay.count.futex_do_wait.__futex_wait.futex_wait.do_futex
      1290 ±  6%     +27.7%       1647 ± 12%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    272.83           +16.9%     319.00 ±  2%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
    238.33 ±  4%     +24.8%     297.33        perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
      2356 ± 47%     -66.5%     788.64 ± 92%  perf-sched.wait_and_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     44.95 ±147%   +2131.0%       1002 ± 69%  perf-sched.wait_and_delay.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
     59.34 ± 41%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.ww_mutex_lock.drm_gem_vunmap.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
      0.01 ±146%  +30486.1%       1.84 ± 31%  perf-sched.wait_time.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
    213.84 ± 51%     -87.7%      26.25 ± 77%  perf-sched.wait_time.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     15.82 ±  2%     -96.0%       0.63 ± 66%  perf-sched.wait_time.avg.ms.__cond_resched.ww_mutex_lock.drm_gem_vunmap.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
    800.27           -15.3%     677.81 ± 13%  perf-sched.wait_time.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.61 ±  5%     +15.0%       0.70 ±  7%  perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.34 ±  2%     -21.8%       1.83        perf-sched.wait_time.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
      1.05 ±  2%    +199.9%       3.15        perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.11 ± 17%   +1066.9%       1.23 ± 12%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      0.47 ± 38%    +148.3%       1.17 ± 30%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
     38.23 ±  7%     -12.2%      33.58 ±  7%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
     61.57 ±  3%     -24.2%      46.69 ±  2%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
      0.00 ± 49%   +5108.3%       0.10 ± 73%  perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      0.01 ±142%  +28030.6%       2.30 ± 23%  perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
      2356 ± 47%     -66.5%     788.64 ± 92%  perf-sched.wait_time.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     45.07 ±146%   +2125.3%       1002 ± 69%  perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
     11.69 ±  6%     -17.3%       9.67 ±  9%  perf-sched.wait_time.max.ms.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep.do_syscall_64
    237.73 ±143%    +561.4%       1572 ± 61%  perf-sched.wait_time.max.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
      3.69 ±  4%   +1614.4%      63.33 ± 48%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      3.90 ± 26%    +959.8%      41.38 ±100%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      0.04 ± 50%   +5947.9%       2.19 ± 52%  perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
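
As a reading aid for the %change column above, the following is a minimal sketch that recomputes the relative change from the two printed means. It assumes the column is (new - base) / base expressed in percent; the helper name `pct_change` is hypothetical and is not part of lkp-tests. The robot presumably works from unrounded samples, so recomputing from the rounded values in the table can differ in the last digit. Rows whose metric is itself a percentage (for example mpstat.cpu.all.sys%) appear to report an absolute difference instead and are not covered here.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (assumed formula)."""
    return (new - base) / base * 100.0


# Means copied from the comparison table: base commit aceccac58a vs. e837456fdc.
fps = pct_change(39.69, 48.89)   # pts.quadray.1.1080p.fps
cs = pct_change(21104, 23039)    # vmstat.system.cs

print(f"pts.quadray.1.1080p.fps: {fps:+.1f}%")
print(f"vmstat.system.cs:        {cs:+.1f}%")
```

Both recomputed values agree with the table (+23.2% and +9.2%) after rounding to one decimal place.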




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

