* [linus:master] [sched/deadline]  cccb45d7c4: stress-ng.netdev.ops_per_sec 61.6% regression
From: kernel test robot @ 2025-08-07  8:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, Chris Mason, Juri Lelli, aubrey.li,
	yu.c.chen, oliver.sang



Hello,

In addition to the regressions (and improvements) we reported as
"[tip:sched/core] [sched/deadline]  cccb45d7c4: will-it-scale.per_thread_ops 36.7% regression"
in
https://lore.kernel.org/all/202507230755.5fe8e03e-lkp@intel.com/

we have now captured 2 more regressions with this commit in mainline. Just FYI.



kernel test robot noticed a 61.6% regression of stress-ng.netdev.ops_per_sec on:


commit: cccb45d7c4295bbfeba616582d0249f2d21e6df5 ("sched/deadline: Less agressive dl_server handling")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on      linus/master 7e161a991ea71e6ec526abc8f40c6852ebe3d946]
[still regression on linux-next/master afec768a6a8fe7fb02a08ffce5f2f556f51d4b52]

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E  CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: netdev
	cpufreq_governor: performance


In addition to that, the commit also has a significant impact on the following tests:

+------------------+----------------------------------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_Mbps -2.2% regression                                            |
| test machine     | 20 threads 1 sockets (Comet Lake) with 16G memory                                            |
| test parameters  | cluster=cs-localhost                                                                         |
|                  | cpufreq_governor=performance                                                                 |
|                  | ip=ipv4                                                                                      |
|                  | nr_threads=200%                                                                              |
|                  | runtime=900s                                                                                 |
|                  | test=TCP_STREAM                                                                              |
+------------------+----------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202508071007.7b2e45c0-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250807/202508071007.7b2e45c0-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp3/netdev/stress-ng/60s

commit: 
  570c8efd5e ("sched/psi: Optimize psi_group_change() cpu_clock() usage")
  cccb45d7c4 ("sched/deadline: Less agressive dl_server handling")

570c8efd5eb79c37 cccb45d7c4295bbfeba616582d0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  3.63e+08          +643.3%  2.698e+09 ± 13%  cpuidle..time
    204743 ±  3%   +1975.4%    4249334 ± 10%  cpuidle..usage
      4.97          +341.0%      21.93 ± 10%  vmstat.cpu.id
    184.63           -22.2%     143.65 ±  4%  vmstat.procs.r
      3473         +3085.4%     110658 ±  8%  vmstat.system.cs
    408964            +2.0%     417266        vmstat.system.in
   1113721           +17.7%    1310862        meminfo.Active
   1113721           +17.7%    1310862        meminfo.Active(anon)
    177366 ±  2%     +11.0%     196822 ±  4%  meminfo.DirectMap4k
    298665           +19.2%     355990 ±  2%  meminfo.Mapped
    420393           +45.3%     610884 ±  2%  meminfo.Shmem
      2.96 ± 20%     +16.7       19.66 ± 14%  mpstat.cpu.all.idle%
      0.26            +0.3        0.61 ±  3%  mpstat.cpu.all.irq%
      0.00 ± 20%      +0.0        0.03 ±  5%  mpstat.cpu.all.soft%
     96.53           -17.1       79.47 ±  3%  mpstat.cpu.all.sys%
     14.83 ± 61%    +238.2%      50.17 ± 20%  mpstat.max_utilization.seconds
    100.00           -14.6%      85.36 ±  3%  mpstat.max_utilization_pct
   9547602           -61.6%    3667923 ±  4%  stress-ng.netdev.ops
    159180           -61.6%      61151 ±  4%  stress-ng.netdev.ops_per_sec
     67355            +2.8%      69256        stress-ng.time.minor_page_faults
     19016           -21.0%      15021 ±  4%  stress-ng.time.percent_of_cpu_this_job_got
     11432           -21.0%       9033 ±  4%  stress-ng.time.system_time
     35368 ±  2%   +9542.0%    3410177 ±  8%  stress-ng.time.voluntary_context_switches
    278515           +17.5%     327222        proc-vmstat.nr_active_anon
    995358            +4.7%    1042520        proc-vmstat.nr_file_pages
     74999           +18.3%      88740 ±  2%  proc-vmstat.nr_mapped
    105146           +44.9%     152305 ±  2%  proc-vmstat.nr_shmem
    278515           +17.5%     327222        proc-vmstat.nr_zone_active_anon
    826913            +7.7%     890858        proc-vmstat.numa_hit
    629070           +10.1%     692863        proc-vmstat.numa_local
    873883            +7.2%     936679        proc-vmstat.pgalloc_normal
    418067            +2.9%     430228        proc-vmstat.pgfault
      0.10 ±  3%     +37.8%       0.14 ±  2%  perf-stat.i.MPKI
 2.248e+10           -22.9%  1.733e+10 ±  4%  perf-stat.i.branch-instructions
      0.10 ±  2%      +0.0        0.15 ±  4%  perf-stat.i.branch-miss-rate%
  18947128           +20.6%   22857416        perf-stat.i.branch-misses
     35.42           -17.1       18.35 ±  9%  perf-stat.i.cache-miss-rate%
   9364646           +11.9%   10482390 ±  2%  perf-stat.i.cache-misses
  27205535          +125.0%   61210467 ± 11%  perf-stat.i.cache-references
      3273 ±  2%   +3392.2%     114320 ±  8%  perf-stat.i.context-switches
      5.35            +3.8%       5.56        perf-stat.i.cpi
 6.028e+11           -20.1%  4.818e+11 ±  4%  perf-stat.i.cpu-cycles
    327.85          +343.1%       1452 ±  9%  perf-stat.i.cpu-migrations
     68905           -27.8%      49741 ±  2%  perf-stat.i.cycles-between-cache-misses
  1.12e+11           -23.0%  8.626e+10 ±  4%  perf-stat.i.instructions
      0.19            -3.5%       0.18        perf-stat.i.ipc
      4316 ±  2%      +6.1%       4578        perf-stat.i.minor-faults
      4316 ±  2%      +6.1%       4578        perf-stat.i.page-faults
      0.08           +45.3%       0.12 ±  2%  perf-stat.overall.MPKI
      0.08            +0.0        0.13 ±  5%  perf-stat.overall.branch-miss-rate%
     34.42           -17.1       17.28 ±  9%  perf-stat.overall.cache-miss-rate%
      5.38            +3.8%       5.59        perf-stat.overall.cpi
     64384           -28.5%      46017 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.19            -3.7%       0.18        perf-stat.overall.ipc
 2.211e+10           -22.9%  1.705e+10 ±  4%  perf-stat.ps.branch-instructions
  18642811           +20.5%   22455956        perf-stat.ps.branch-misses
   9210208           +11.8%   10296398 ±  2%  perf-stat.ps.cache-misses
  26761745          +124.9%   60190009 ± 11%  perf-stat.ps.cache-references
      3220 ±  2%   +3391.4%     112425 ±  8%  perf-stat.ps.context-switches
  5.93e+11           -20.1%  4.739e+11 ±  4%  perf-stat.ps.cpu-cycles
    322.54          +343.0%       1428 ±  9%  perf-stat.ps.cpu-migrations
 1.102e+11           -23.0%  8.484e+10 ±  4%  perf-stat.ps.instructions
      4239 ±  2%      +5.3%       4464        perf-stat.ps.minor-faults
      4239 ±  2%      +5.3%       4464        perf-stat.ps.page-faults
 6.771e+12           -23.7%  5.169e+12 ±  4%  perf-stat.total.instructions
   5992277           -35.8%    3846765 ±  8%  sched_debug.cfs_rq:/.avg_vruntime.avg
   6049811           -19.2%    4888185 ±  5%  sched_debug.cfs_rq:/.avg_vruntime.max
   5847973           -63.4%    2140155 ±  6%  sched_debug.cfs_rq:/.avg_vruntime.min
     30248 ± 13%   +3774.5%    1171963 ±  2%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      0.53           -21.2%       0.42 ±  4%  sched_debug.cfs_rq:/.h_nr_queued.avg
      0.50          -100.0%       0.00        sched_debug.cfs_rq:/.h_nr_queued.min
      0.17 ± 10%     +99.4%       0.34 ±  3%  sched_debug.cfs_rq:/.h_nr_queued.stddev
      0.53           -21.3%       0.42 ±  4%  sched_debug.cfs_rq:/.h_nr_runnable.avg
      0.50          -100.0%       0.00        sched_debug.cfs_rq:/.h_nr_runnable.min
      0.17 ± 10%     +98.9%       0.34 ±  3%  sched_debug.cfs_rq:/.h_nr_runnable.stddev
      2696          -100.0%       0.00        sched_debug.cfs_rq:/.load.min
      2.50           -83.3%       0.42 ±107%  sched_debug.cfs_rq:/.load_avg.min
   5992277           -35.8%    3846765 ±  8%  sched_debug.cfs_rq:/.min_vruntime.avg
   6049811           -19.2%    4888185 ±  5%  sched_debug.cfs_rq:/.min_vruntime.max
   5847973           -63.4%    2140155 ±  6%  sched_debug.cfs_rq:/.min_vruntime.min
     30248 ± 13%   +3774.5%    1171963 ±  2%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.53           -21.2%       0.42 ±  4%  sched_debug.cfs_rq:/.nr_queued.avg
      0.50          -100.0%       0.00        sched_debug.cfs_rq:/.nr_queued.min
      0.12 ±  8%    +185.0%       0.33 ±  4%  sched_debug.cfs_rq:/.nr_queued.stddev
    588.21           -17.7%     484.36 ±  3%  sched_debug.cfs_rq:/.runnable_avg.avg
    489.25 ±  6%     -95.0%      24.25 ±141%  sched_debug.cfs_rq:/.runnable_avg.min
    136.65 ±  9%     +70.5%     233.00 ±  4%  sched_debug.cfs_rq:/.runnable_avg.stddev
    585.65           -17.5%     482.95 ±  3%  sched_debug.cfs_rq:/.util_avg.avg
    410.58 ± 29%     -94.4%      23.00 ±141%  sched_debug.cfs_rq:/.util_avg.min
    117.24 ±  7%     +99.4%     233.84 ±  4%  sched_debug.cfs_rq:/.util_avg.stddev
    520.05           -32.1%     353.31 ±  6%  sched_debug.cfs_rq:/.util_est.avg
      1139 ± 14%     -19.1%     921.17 ± 11%  sched_debug.cfs_rq:/.util_est.max
    387.58 ± 45%    -100.0%       0.00        sched_debug.cfs_rq:/.util_est.min
     67.01 ± 18%    +283.1%     256.74 ±  2%  sched_debug.cfs_rq:/.util_est.stddev
    669274 ± 19%     +60.4%    1073556 ±  6%  sched_debug.cpu.avg_idle.avg
   1885708 ± 21%     +44.0%    2714848 ±  7%  sched_debug.cpu.avg_idle.max
      7213 ± 87%    +597.3%      50301 ±  5%  sched_debug.cpu.avg_idle.min
     16.82 ± 12%     -35.5%      10.86 ±  8%  sched_debug.cpu.clock.stddev
      2573           -21.9%       2010 ±  4%  sched_debug.cpu.curr->pid.avg
      2303 ± 12%    -100.0%       0.00        sched_debug.cpu.curr->pid.min
    448.26 ±  8%    +220.4%       1436 ±  4%  sched_debug.cpu.curr->pid.stddev
    235851 ± 10%     +17.7%     277484 ±  9%  sched_debug.cpu.max_idle_balance_cost.stddev
      0.53           -21.5%       0.42 ±  4%  sched_debug.cpu.nr_running.avg
      0.50          -100.0%       0.00        sched_debug.cpu.nr_running.min
      0.16 ± 11%    +108.2%       0.33 ±  3%  sched_debug.cpu.nr_running.stddev
      1673 ± 16%   +1028.2%      18878 ±  7%  sched_debug.cpu.nr_switches.avg
      2564 ± 67%    +752.3%      21858 ±  4%  sched_debug.cpu.nr_switches.stddev
      0.00 ±111%   +6575.0%       0.12 ± 14%  sched_debug.cpu.nr_uninterruptible.avg
    -31.42          +179.6%     -87.83        sched_debug.cpu.nr_uninterruptible.min
      0.03 ±  7%    +349.1%       0.12 ±  8%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.03 ± 99%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
      0.08 ±107%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
      0.01 ±  8%     +78.1%       0.01 ± 17%  perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.79 ± 29%     -99.2%       0.01 ±  9%  perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
      0.60 ± 65%     -92.2%       0.05 ±192%  perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.03 ± 70%     -71.0%       0.01 ± 36%  perf-sched.sch_delay.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
      0.23 ± 26%     -90.3%       0.02 ± 36%  perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.50 ± 42%     -86.5%       0.07 ±120%  perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.05 ± 45%     -83.5%       0.01 ±  6%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.06 ± 28%     -73.9%       0.02 ± 35%  perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.01 ± 10%    +136.7%       0.02 ± 46%  perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.26 ± 29%     -96.0%       0.01 ± 85%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
      0.30 ± 40%     -95.2%       0.01 ± 21%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
      0.31 ± 41%     -95.3%       0.01 ± 20%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
      0.20 ± 39%     -96.3%       0.01 ± 25%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.01           +73.3%       0.01 ± 14%  perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      0.01 ± 32%     -39.7%       0.01 ±  6%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.21 ± 22%     -96.2%       0.01 ± 22%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      0.27 ±139%    -100.0%       0.00        perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
      0.58 ± 90%    -100.0%       0.00        perf-sched.sch_delay.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
      0.01 ± 21%    +204.2%       0.02 ± 75%  perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      3.39 ±  8%     -99.7%       0.01 ±  7%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
      2.08 ± 55%     -92.1%       0.16 ±210%  perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      3.41 ±  7%     -90.9%       0.31 ± 97%  perf-sched.sch_delay.max.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
      6.55 ± 39%     -91.8%       0.54 ± 56%  perf-sched.sch_delay.max.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      2.46 ± 39%     -98.1%       0.05 ± 73%  perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.88 ± 49%     -92.6%       0.44 ± 88%  perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.01 ± 21%    +433.9%       0.06 ± 78%  perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      6.90 ± 37%    +311.9%      28.44 ±  8%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
      9.31 ± 16%    +223.8%      30.16 ± 11%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
      3.20 ±  6%     -97.0%       0.10 ± 75%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.01 ±  8%    +174.5%       0.02 ± 47%  perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      1.83 ± 20%     -95.6%       0.08 ± 21%  perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      2.67 ± 38%     -98.4%       0.04 ± 82%  perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      0.18 ± 37%     -91.7%       0.01 ± 18%  perf-sched.total_sch_delay.average.ms
     10.22          +205.5%      31.22 ±  9%  perf-sched.total_sch_delay.max.ms
    108.64 ±  6%     -93.1%       7.50 ±  3%  perf-sched.total_wait_and_delay.average.ms
     12100 ±  7%   +1924.9%     245027 ±  4%  perf-sched.total_wait_and_delay.count.ms
      4980           -18.5%       4056 ±  8%  perf-sched.total_wait_and_delay.max.ms
    108.47 ±  6%     -93.1%       7.48 ±  3%  perf-sched.total_wait_time.average.ms
      4980           -18.5%       4056 ±  8%  perf-sched.total_wait_time.max.ms
      7.85           -92.1%       0.62 ±223%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
    589.78 ±  7%     +28.3%     756.97        perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      1.16 ± 28%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
    500.89           -74.7%     126.73 ± 19%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
      1.20 ± 10%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      6.80 ±  4%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     47.83 ±  7%     -22.6%      37.00        perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
    109.67 ±  3%    -100.0%       0.00        perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
    602.00 ± 48%     -93.8%      37.33 ±223%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     24.00          +438.9%     129.33 ± 15%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
      3099 ±  8%   +3753.2%     119435 ±  4%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
      3170 ±  8%   +3674.8%     119693 ±  4%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
     86.33          -100.0%       0.00        perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    712.00 ±  4%    -100.0%       0.00        perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      1707 ±  2%     +48.8%       2540 ± 19%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    449.67 ±  4%     +20.9%     543.67 ±  5%  perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      4980           -96.7%     166.80 ±223%  perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      5.85 ± 18%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
     13.96 ± 35%    +145.9%      34.34 ± 34%  perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
     18.63 ± 16%     +81.2%      33.75 ± 25%  perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
      7.17 ± 12%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    479.50 ±  8%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      7.82           -50.2%       3.90 ±  8%  perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.25 ±140%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
      0.06 ±147%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
    589.55 ±  7%     +28.4%     756.94        perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.20 ±151%    +304.7%       0.82 ± 29%  perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
      1.10 ± 31%     -70.5%       0.33 ±  9%  perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
    500.62           -74.7%     126.72 ± 19%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
      0.39 ± 43%    +156.4%       1.00 ± 17%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
      0.40 ± 39%    +150.1%       1.01 ± 17%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
      1.00 ±  8%     -41.3%       0.58 ±  5%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.21 ± 24%     -98.1%       0.00 ± 38%  perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      4980           -79.9%       1000        perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      1.15 ±108%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
      0.44 ± 96%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
      0.41 ±152%    +306.1%       1.65 ± 29%  perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
      3.88 ±  7%     -66.4%       1.30 ± 27%  perf-sched.wait_time.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.36 ±  5%     -61.1%       2.09 ±  4%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      2.67 ± 38%     -98.7%       0.03 ± 71%  perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
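
For readers checking the numbers by hand, the sketch below recomputes the %change column from the two value columns of the table above. It assumes, based on the rows themselves, that a change printed with a '%' sign is relative to the parent commit's value, while a change printed without one (the rows whose metric is already a percentage) is an absolute difference. The function names are illustrative and are not part of the lkp tooling.

```python
def rel_change_pct(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def abs_change(base: float, new: float) -> float:
    """Absolute difference, for rows whose change column has no '%' sign."""
    return new - base


# Values copied from the stress-ng table above (parent commit vs. cccb45d7c4).
print(f"{rel_change_pct(159180, 61151):+.1f}%")   # stress-ng.netdev.ops_per_sec, prints -61.6%
print(f"{rel_change_pct(35368, 3410177):+.1f}%")  # voluntary_context_switches, prints +9542.0%
print(f"{abs_change(96.53, 79.47):+.1f}")         # mpstat.cpu.all.sys%, prints -17.1
```

Not every row reproduces to the last digit this way: for vmstat.system.cs, for example, recomputing from the printed 3473 and 110658 gives +3086.2% against the printed +3085.4%, presumably because the printed values are rounded averages over several runs.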



***************************************************************************************************
lkp-cml-d02: 20 threads 1 sockets (Comet Lake) with 16G memory
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
  cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-9.4/200%/debian-12-x86_64-20240206.cgz/900s/lkp-cml-d02/TCP_STREAM/netperf

commit: 
  570c8efd5e ("sched/psi: Optimize psi_group_change() cpu_clock() usage")
  cccb45d7c4 ("sched/deadline: Less agressive dl_server handling")

570c8efd5eb79c37 cccb45d7c4295bbfeba616582d0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      8399           +29.1%      10840 ± 13%  vmstat.system.cs
    974775 ±  5%     +12.9%    1100703 ± 12%  sched_debug.cpu.avg_idle.max
      8.99 ±  3%      +8.1%       9.72 ±  6%  sched_debug.cpu.clock.stddev
    208144           +24.9%     260012 ± 12%  sched_debug.cpu.nr_switches.avg
    166255           +16.0%     192901 ± 11%  sched_debug.cpu.nr_switches.stddev
     31290            +3.7%      32458 ±  2%  proc-vmstat.nr_shmem
 2.324e+08            -2.1%  2.274e+08        proc-vmstat.numa_hit
 2.324e+08            -2.1%  2.274e+08        proc-vmstat.numa_local
 1.851e+09            -2.1%  1.812e+09        proc-vmstat.pgalloc_normal
 1.851e+09            -2.1%  1.812e+09        proc-vmstat.pgfree
      1683            -2.2%       1647        netperf.ThroughputBoth_Mbps
     67336            -2.2%      65887        netperf.ThroughputBoth_total_Mbps
      1683            -2.2%       1647        netperf.Throughput_Mbps
     67336            -2.2%      65887        netperf.Throughput_total_Mbps
   2006974           +41.4%    2838711 ± 16%  netperf.time.involuntary_context_switches
 4.624e+08            -2.2%  4.524e+08        netperf.workload
    117.25            +1.7%     119.19        perf-stat.i.MPKI
 6.963e+08            -1.5%  6.858e+08        perf-stat.i.branch-instructions
      8356           +29.1%      10785 ± 13%  perf-stat.i.context-switches
     25.44            +1.5%      25.82        perf-stat.i.cpi
    117.06            +2.5%     119.93        perf-stat.i.cpu-migrations
  3.35e+09            -1.5%    3.3e+09        perf-stat.i.instructions
      0.04            -1.5%       0.04        perf-stat.i.ipc
    115.09            +1.7%     117.03        perf-stat.overall.MPKI
     24.97            +1.5%      25.35        perf-stat.overall.cpi
      0.04            -1.5%       0.04        perf-stat.overall.ipc
 6.954e+08            -1.5%  6.849e+08        perf-stat.ps.branch-instructions
      8346           +29.1%      10773 ± 13%  perf-stat.ps.context-switches
    116.93            +2.5%     119.85        perf-stat.ps.cpu-migrations
 3.346e+09            -1.5%  3.296e+09        perf-stat.ps.instructions
 3.021e+12            -1.5%  2.976e+12        perf-stat.total.instructions
      7.77 ±  4%     -24.0%       5.91 ± 16%  perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
      5.10 ± 14%     -19.5%       4.11 ± 13%  perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
      6.21 ±  6%     -21.0%       4.91 ± 13%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
      6.78 ± 12%     -23.9%       5.16 ± 12%  perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      6.41 ±  3%     -21.6%       5.03 ± 15%  perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      1.14 ±119%    +155.9%       2.90 ± 51%  perf-sched.sch_delay.avg.ms.__cond_resched.task_work_run.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.91           -21.7%       5.41 ± 13%  perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      4.82 ± 18%     -35.7%       3.10 ± 17%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      7.39 ±  4%     -25.4%       5.51 ± 13%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      1.65 ± 24%     +33.5%       2.21 ± 16%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      5.41           -17.6%       4.46 ± 11%  perf-sched.total_sch_delay.average.ms
     18.99           -16.5%      15.86 ± 11%  perf-sched.total_wait_and_delay.average.ms
     48294           +23.0%      59391 ± 12%  perf-sched.total_wait_and_delay.count.ms
     13.58           -16.0%      11.40 ± 10%  perf-sched.total_wait_time.average.ms
     15.56 ±  4%     -24.0%      11.83 ± 16%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
    102.17 ±  8%     -28.6%      72.96 ± 30%  perf-sched.wait_and_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     13.81 ±  5%     -20.3%      11.01 ± 12%  perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
     14.64           -21.1%      11.55 ± 13%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
    641.16 ±  6%      +7.6%     689.79 ±  5%  perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      4579 ±  6%     +44.6%       6622 ± 25%  perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
    101.80 ±  3%     +12.0%     114.00 ±  8%  perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
     35.20 ± 15%     +35.4%      47.67 ± 29%  perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     23823           +31.1%      31228 ± 15%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      1456 ±  3%     +28.9%       1878 ± 18%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    764.24 ± 18%     -40.2%     457.29 ± 29%  perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      7.79 ±  4%     -24.0%       5.92 ± 16%  perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
      7.60 ±  5%     -19.7%       6.10 ± 12%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
      6.81 ± 12%     -23.3%       5.22 ± 13%  perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      8.41 ±  3%     -22.1%       6.55 ± 14%  perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      7.73           -20.6%       6.14 ± 12%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     10.04 ± 17%     -28.6%       7.17 ±  9%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    641.09 ±  6%      +7.5%     689.00 ±  5%  perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
    764.17 ± 18%     -40.2%     457.17 ± 29%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     41.85            -0.3       41.58        perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
      0.64 ±  8%      -0.3        0.38 ± 71%  perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
      0.62 ±  3%      -0.2        0.38 ± 70%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
      0.60 ±  3%      -0.2        0.37 ± 70%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue
     34.37            -0.2       34.15        perf-profile.calltrace.cycles-pp._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
     33.61            -0.2       33.39        perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg
     34.89            -0.2       34.74        perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      0.76 ±  7%      -0.1        0.64 ± 12%  perf-profile.calltrace.cycles-pp.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
     99.42            -0.1       99.36        perf-profile.calltrace.cycles-pp.main
      2.42 ±  2%      +0.1        2.49        perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
      2.43 ±  2%      +0.1        2.50        perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
      3.86 ±  2%      +0.1        3.96        perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
      3.88            +0.1        3.98        perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
      3.88            +0.1        3.98        perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
      3.99 ±  2%      +0.1        4.10        perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.handle_softirqs
      4.02 ±  2%      +0.1        4.13        perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.handle_softirqs.do_softirq
      4.02 ±  2%      +0.1        4.13        perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip
     54.53            +0.2       54.68        perf-profile.calltrace.cycles-pp.recv.recv_omni.process_requests.spawn_child.accept_connection
     54.44            +0.2       54.60        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv.recv_omni.process_requests
     54.44            +0.2       54.60        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recv.recv_omni.process_requests.spawn_child
     54.77            +0.2       54.94        perf-profile.calltrace.cycles-pp.accept_connection.accept_connections.main
     54.77            +0.2       54.94        perf-profile.calltrace.cycles-pp.accept_connections.main
     54.77            +0.2       54.94        perf-profile.calltrace.cycles-pp.process_requests.spawn_child.accept_connection.accept_connections.main
     54.77            +0.2       54.94        perf-profile.calltrace.cycles-pp.spawn_child.accept_connection.accept_connections.main
     54.77            +0.2       54.94        perf-profile.calltrace.cycles-pp.recv_omni.process_requests.spawn_child.accept_connection.accept_connections
     41.87            -0.3       41.60        perf-profile.children.cycles-pp.tcp_sendmsg_locked
     34.39            -0.2       34.16        perf-profile.children.cycles-pp._copy_from_iter
     34.91            -0.2       34.76        perf-profile.children.cycles-pp.skb_do_copy_data_nocache
      1.38 ±  4%      -0.1        1.25 ±  4%  perf-profile.children.cycles-pp.napi_consume_skb
      0.86 ±  3%      -0.1        0.74 ± 12%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      3.55            -0.1        3.43 ±  2%  perf-profile.children.cycles-pp.skb_release_data
     99.55            -0.1       99.50        perf-profile.children.cycles-pp.main
     98.61            -0.0       98.56        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.06 ±  6%      +0.0        0.08 ± 14%  perf-profile.children.cycles-pp.switch_fpu_return
      0.33 ±  3%      +0.1        0.38 ±  8%  perf-profile.children.cycles-pp.schedule
      0.35 ±  2%      +0.1        0.41 ±  9%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
      0.39 ±  4%      +0.1        0.46 ± 11%  perf-profile.children.cycles-pp.__schedule
     54.77            +0.2       54.94        perf-profile.children.cycles-pp.accept_connection
     54.77            +0.2       54.94        perf-profile.children.cycles-pp.accept_connections
     54.77            +0.2       54.94        perf-profile.children.cycles-pp.process_requests
     54.77            +0.2       54.94        perf-profile.children.cycles-pp.spawn_child
     54.77            +0.2       54.94        perf-profile.children.cycles-pp.recv_omni
     54.63            +0.2       54.80        perf-profile.children.cycles-pp.recv
      0.86 ±  3%      -0.1        0.74 ± 12%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      1.12            -0.0        1.09        perf-profile.self.cycles-pp.__free_frozen_pages
      0.15 ±  2%      +0.0        0.18 ± 12%  perf-profile.self.cycles-pp.__rmqueue_pcplist
      0.22 ± 11%      +0.0        0.27 ±  6%  perf-profile.self.cycles-pp.__check_object_size



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

