From: kernel test robot <oliver.sang@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>, Chris Mason <clm@meta.com>,
Juri Lelli <juri.lelli@redhat.com>, <aubrey.li@linux.intel.com>,
<yu.c.chen@intel.com>, <oliver.sang@intel.com>
Subject: [linus:master] [sched/deadline] cccb45d7c4: stress-ng.netdev.ops_per_sec 61.6% regression
Date: Thu, 7 Aug 2025 16:34:21 +0800
Message-ID: <202508071007.7b2e45c0-lkp@intel.com>
Hello,
Besides the regressions (and improvements) we reported as
"[tip:sched/core] [sched/deadline] cccb45d7c4: will-it-scale.per_thread_ops 36.7% regression"
in
https://lore.kernel.org/all/202507230755.5fe8e03e-lkp@intel.com/
we have now captured 2 more regressions with this commit in mainline. Just FYI.
kernel test robot noticed a 61.6% regression of stress-ng.netdev.ops_per_sec on:
commit: cccb45d7c4295bbfeba616582d0249f2d21e6df5 ("sched/deadline: Less agressive dl_server handling")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[still regression on linus/master 7e161a991ea71e6ec526abc8f40c6852ebe3d946]
[still regression on linux-next/master afec768a6a8fe7fb02a08ffce5f2f556f51d4b52]
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: netdev
cpufreq_governor: performance
In addition to that, the commit also has significant impact on the following tests:
+------------------+----------------------------------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_Mbps -2.2% regression                                            |
| test machine     | 20 threads 1 sockets (Comet Lake) with 16G memory                                            |
| test parameters  | cluster=cs-localhost                                                                         |
|                  | cpufreq_governor=performance                                                                 |
|                  | ip=ipv4                                                                                      |
|                  | nr_threads=200%                                                                              |
|                  | runtime=900s                                                                                 |
|                  | test=TCP_STREAM                                                                              |
+------------------+----------------------------------------------------------------------------------------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202508071007.7b2e45c0-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250807/202508071007.7b2e45c0-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp3/netdev/stress-ng/60s
commit:
570c8efd5e ("sched/psi: Optimize psi_group_change() cpu_clock() usage")
cccb45d7c4 ("sched/deadline: Less agressive dl_server handling")
570c8efd5eb79c37 cccb45d7c4295bbfeba616582d0
---------------- ---------------------------
%stddev %change %stddev
\ | \
3.63e+08 +643.3% 2.698e+09 ± 13% cpuidle..time
204743 ± 3% +1975.4% 4249334 ± 10% cpuidle..usage
4.97 +341.0% 21.93 ± 10% vmstat.cpu.id
184.63 -22.2% 143.65 ± 4% vmstat.procs.r
3473 +3085.4% 110658 ± 8% vmstat.system.cs
408964 +2.0% 417266 vmstat.system.in
1113721 +17.7% 1310862 meminfo.Active
1113721 +17.7% 1310862 meminfo.Active(anon)
177366 ± 2% +11.0% 196822 ± 4% meminfo.DirectMap4k
298665 +19.2% 355990 ± 2% meminfo.Mapped
420393 +45.3% 610884 ± 2% meminfo.Shmem
2.96 ± 20% +16.7 19.66 ± 14% mpstat.cpu.all.idle%
0.26 +0.3 0.61 ± 3% mpstat.cpu.all.irq%
0.00 ± 20% +0.0 0.03 ± 5% mpstat.cpu.all.soft%
96.53 -17.1 79.47 ± 3% mpstat.cpu.all.sys%
14.83 ± 61% +238.2% 50.17 ± 20% mpstat.max_utilization.seconds
100.00 -14.6% 85.36 ± 3% mpstat.max_utilization_pct
9547602 -61.6% 3667923 ± 4% stress-ng.netdev.ops
159180 -61.6% 61151 ± 4% stress-ng.netdev.ops_per_sec
67355 +2.8% 69256 stress-ng.time.minor_page_faults
19016 -21.0% 15021 ± 4% stress-ng.time.percent_of_cpu_this_job_got
11432 -21.0% 9033 ± 4% stress-ng.time.system_time
35368 ± 2% +9542.0% 3410177 ± 8% stress-ng.time.voluntary_context_switches
278515 +17.5% 327222 proc-vmstat.nr_active_anon
995358 +4.7% 1042520 proc-vmstat.nr_file_pages
74999 +18.3% 88740 ± 2% proc-vmstat.nr_mapped
105146 +44.9% 152305 ± 2% proc-vmstat.nr_shmem
278515 +17.5% 327222 proc-vmstat.nr_zone_active_anon
826913 +7.7% 890858 proc-vmstat.numa_hit
629070 +10.1% 692863 proc-vmstat.numa_local
873883 +7.2% 936679 proc-vmstat.pgalloc_normal
418067 +2.9% 430228 proc-vmstat.pgfault
0.10 ± 3% +37.8% 0.14 ± 2% perf-stat.i.MPKI
2.248e+10 -22.9% 1.733e+10 ± 4% perf-stat.i.branch-instructions
0.10 ± 2% +0.0 0.15 ± 4% perf-stat.i.branch-miss-rate%
18947128 +20.6% 22857416 perf-stat.i.branch-misses
35.42 -17.1 18.35 ± 9% perf-stat.i.cache-miss-rate%
9364646 +11.9% 10482390 ± 2% perf-stat.i.cache-misses
27205535 +125.0% 61210467 ± 11% perf-stat.i.cache-references
3273 ± 2% +3392.2% 114320 ± 8% perf-stat.i.context-switches
5.35 +3.8% 5.56 perf-stat.i.cpi
6.028e+11 -20.1% 4.818e+11 ± 4% perf-stat.i.cpu-cycles
327.85 +343.1% 1452 ± 9% perf-stat.i.cpu-migrations
68905 -27.8% 49741 ± 2% perf-stat.i.cycles-between-cache-misses
1.12e+11 -23.0% 8.626e+10 ± 4% perf-stat.i.instructions
0.19 -3.5% 0.18 perf-stat.i.ipc
4316 ± 2% +6.1% 4578 perf-stat.i.minor-faults
4316 ± 2% +6.1% 4578 perf-stat.i.page-faults
0.08 +45.3% 0.12 ± 2% perf-stat.overall.MPKI
0.08 +0.0 0.13 ± 5% perf-stat.overall.branch-miss-rate%
34.42 -17.1 17.28 ± 9% perf-stat.overall.cache-miss-rate%
5.38 +3.8% 5.59 perf-stat.overall.cpi
64384 -28.5% 46017 ± 2% perf-stat.overall.cycles-between-cache-misses
0.19 -3.7% 0.18 perf-stat.overall.ipc
2.211e+10 -22.9% 1.705e+10 ± 4% perf-stat.ps.branch-instructions
18642811 +20.5% 22455956 perf-stat.ps.branch-misses
9210208 +11.8% 10296398 ± 2% perf-stat.ps.cache-misses
26761745 +124.9% 60190009 ± 11% perf-stat.ps.cache-references
3220 ± 2% +3391.4% 112425 ± 8% perf-stat.ps.context-switches
5.93e+11 -20.1% 4.739e+11 ± 4% perf-stat.ps.cpu-cycles
322.54 +343.0% 1428 ± 9% perf-stat.ps.cpu-migrations
1.102e+11 -23.0% 8.484e+10 ± 4% perf-stat.ps.instructions
4239 ± 2% +5.3% 4464 perf-stat.ps.minor-faults
4239 ± 2% +5.3% 4464 perf-stat.ps.page-faults
6.771e+12 -23.7% 5.169e+12 ± 4% perf-stat.total.instructions
5992277 -35.8% 3846765 ± 8% sched_debug.cfs_rq:/.avg_vruntime.avg
6049811 -19.2% 4888185 ± 5% sched_debug.cfs_rq:/.avg_vruntime.max
5847973 -63.4% 2140155 ± 6% sched_debug.cfs_rq:/.avg_vruntime.min
30248 ± 13% +3774.5% 1171963 ± 2% sched_debug.cfs_rq:/.avg_vruntime.stddev
0.53 -21.2% 0.42 ± 4% sched_debug.cfs_rq:/.h_nr_queued.avg
0.50 -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_queued.min
0.17 ± 10% +99.4% 0.34 ± 3% sched_debug.cfs_rq:/.h_nr_queued.stddev
0.53 -21.3% 0.42 ± 4% sched_debug.cfs_rq:/.h_nr_runnable.avg
0.50 -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_runnable.min
0.17 ± 10% +98.9% 0.34 ± 3% sched_debug.cfs_rq:/.h_nr_runnable.stddev
2696 -100.0% 0.00 sched_debug.cfs_rq:/.load.min
2.50 -83.3% 0.42 ±107% sched_debug.cfs_rq:/.load_avg.min
5992277 -35.8% 3846765 ± 8% sched_debug.cfs_rq:/.min_vruntime.avg
6049811 -19.2% 4888185 ± 5% sched_debug.cfs_rq:/.min_vruntime.max
5847973 -63.4% 2140155 ± 6% sched_debug.cfs_rq:/.min_vruntime.min
30248 ± 13% +3774.5% 1171963 ± 2% sched_debug.cfs_rq:/.min_vruntime.stddev
0.53 -21.2% 0.42 ± 4% sched_debug.cfs_rq:/.nr_queued.avg
0.50 -100.0% 0.00 sched_debug.cfs_rq:/.nr_queued.min
0.12 ± 8% +185.0% 0.33 ± 4% sched_debug.cfs_rq:/.nr_queued.stddev
588.21 -17.7% 484.36 ± 3% sched_debug.cfs_rq:/.runnable_avg.avg
489.25 ± 6% -95.0% 24.25 ±141% sched_debug.cfs_rq:/.runnable_avg.min
136.65 ± 9% +70.5% 233.00 ± 4% sched_debug.cfs_rq:/.runnable_avg.stddev
585.65 -17.5% 482.95 ± 3% sched_debug.cfs_rq:/.util_avg.avg
410.58 ± 29% -94.4% 23.00 ±141% sched_debug.cfs_rq:/.util_avg.min
117.24 ± 7% +99.4% 233.84 ± 4% sched_debug.cfs_rq:/.util_avg.stddev
520.05 -32.1% 353.31 ± 6% sched_debug.cfs_rq:/.util_est.avg
1139 ± 14% -19.1% 921.17 ± 11% sched_debug.cfs_rq:/.util_est.max
387.58 ± 45% -100.0% 0.00 sched_debug.cfs_rq:/.util_est.min
67.01 ± 18% +283.1% 256.74 ± 2% sched_debug.cfs_rq:/.util_est.stddev
669274 ± 19% +60.4% 1073556 ± 6% sched_debug.cpu.avg_idle.avg
1885708 ± 21% +44.0% 2714848 ± 7% sched_debug.cpu.avg_idle.max
7213 ± 87% +597.3% 50301 ± 5% sched_debug.cpu.avg_idle.min
16.82 ± 12% -35.5% 10.86 ± 8% sched_debug.cpu.clock.stddev
2573 -21.9% 2010 ± 4% sched_debug.cpu.curr->pid.avg
2303 ± 12% -100.0% 0.00 sched_debug.cpu.curr->pid.min
448.26 ± 8% +220.4% 1436 ± 4% sched_debug.cpu.curr->pid.stddev
235851 ± 10% +17.7% 277484 ± 9% sched_debug.cpu.max_idle_balance_cost.stddev
0.53 -21.5% 0.42 ± 4% sched_debug.cpu.nr_running.avg
0.50 -100.0% 0.00 sched_debug.cpu.nr_running.min
0.16 ± 11% +108.2% 0.33 ± 3% sched_debug.cpu.nr_running.stddev
1673 ± 16% +1028.2% 18878 ± 7% sched_debug.cpu.nr_switches.avg
2564 ± 67% +752.3% 21858 ± 4% sched_debug.cpu.nr_switches.stddev
0.00 ±111% +6575.0% 0.12 ± 14% sched_debug.cpu.nr_uninterruptible.avg
-31.42 +179.6% -87.83 sched_debug.cpu.nr_uninterruptible.min
0.03 ± 7% +349.1% 0.12 ± 8% perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
0.03 ± 99% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.08 ±107% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
0.01 ± 8% +78.1% 0.01 ± 17% perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.79 ± 29% -99.2% 0.01 ± 9% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
0.60 ± 65% -92.2% 0.05 ±192% perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.03 ± 70% -71.0% 0.01 ± 36% perf-sched.sch_delay.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
0.23 ± 26% -90.3% 0.02 ± 36% perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
0.50 ± 42% -86.5% 0.07 ±120% perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
0.05 ± 45% -83.5% 0.01 ± 6% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.06 ± 28% -73.9% 0.02 ± 35% perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.01 ± 10% +136.7% 0.02 ± 46% perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.26 ± 29% -96.0% 0.01 ± 85% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
0.30 ± 40% -95.2% 0.01 ± 21% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
0.31 ± 41% -95.3% 0.01 ± 20% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
0.20 ± 39% -96.3% 0.01 ± 25% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.01 +73.3% 0.01 ± 14% perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
0.01 ± 32% -39.7% 0.01 ± 6% perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
0.21 ± 22% -96.2% 0.01 ± 22% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
0.27 ±139% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.58 ± 90% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
0.01 ± 21% +204.2% 0.02 ± 75% perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
3.39 ± 8% -99.7% 0.01 ± 7% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
2.08 ± 55% -92.1% 0.16 ±210% perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
3.41 ± 7% -90.9% 0.31 ± 97% perf-sched.sch_delay.max.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
6.55 ± 39% -91.8% 0.54 ± 56% perf-sched.sch_delay.max.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
2.46 ± 39% -98.1% 0.05 ± 73% perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.88 ± 49% -92.6% 0.44 ± 88% perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.01 ± 21% +433.9% 0.06 ± 78% perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
6.90 ± 37% +311.9% 28.44 ± 8% perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
9.31 ± 16% +223.8% 30.16 ± 11% perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
3.20 ± 6% -97.0% 0.10 ± 75% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.01 ± 8% +174.5% 0.02 ± 47% perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
1.83 ± 20% -95.6% 0.08 ± 21% perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
2.67 ± 38% -98.4% 0.04 ± 82% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
0.18 ± 37% -91.7% 0.01 ± 18% perf-sched.total_sch_delay.average.ms
10.22 +205.5% 31.22 ± 9% perf-sched.total_sch_delay.max.ms
108.64 ± 6% -93.1% 7.50 ± 3% perf-sched.total_wait_and_delay.average.ms
12100 ± 7% +1924.9% 245027 ± 4% perf-sched.total_wait_and_delay.count.ms
4980 -18.5% 4056 ± 8% perf-sched.total_wait_and_delay.max.ms
108.47 ± 6% -93.1% 7.48 ± 3% perf-sched.total_wait_time.average.ms
4980 -18.5% 4056 ± 8% perf-sched.total_wait_time.max.ms
7.85 -92.1% 0.62 ±223% perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
589.78 ± 7% +28.3% 756.97 perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
1.16 ± 28% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
500.89 -74.7% 126.73 ± 19% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
1.20 ± 10% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
6.80 ± 4% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
47.83 ± 7% -22.6% 37.00 perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
109.67 ± 3% -100.0% 0.00 perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
602.00 ± 48% -93.8% 37.33 ±223% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
24.00 +438.9% 129.33 ± 15% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
3099 ± 8% +3753.2% 119435 ± 4% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
3170 ± 8% +3674.8% 119693 ± 4% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
86.33 -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
712.00 ± 4% -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
1707 ± 2% +48.8% 2540 ± 19% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
449.67 ± 4% +20.9% 543.67 ± 5% perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
4980 -96.7% 166.80 ±223% perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
5.85 ± 18% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
13.96 ± 35% +145.9% 34.34 ± 34% perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
18.63 ± 16% +81.2% 33.75 ± 25% perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
7.17 ± 12% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
479.50 ± 8% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
7.82 -50.2% 3.90 ± 8% perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
0.25 ±140% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.06 ±147% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
589.55 ± 7% +28.4% 756.94 perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
0.20 ±151% +304.7% 0.82 ± 29% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
1.10 ± 31% -70.5% 0.33 ± 9% perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
500.62 -74.7% 126.72 ± 19% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
0.39 ± 43% +156.4% 1.00 ± 17% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.dev_ifconf
0.40 ± 39% +150.1% 1.01 ± 17% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.devinet_ioctl
1.00 ± 8% -41.3% 0.58 ± 5% perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.21 ± 24% -98.1% 0.00 ± 38% perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
4980 -79.9% 1000 perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
1.15 ±108% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.44 ± 96% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
0.41 ±152% +306.1% 1.65 ± 29% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
3.88 ± 7% -66.4% 1.30 ± 27% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.36 ± 5% -61.1% 2.09 ± 4% perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
2.67 ± 38% -98.7% 0.03 ± 71% perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
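As a reading aid for the comparison table above: the %change column appears to be the relative change of the second commit's mean against the first, while rows whose metric is already a percentage (for example mpstat.cpu.all.sys%) appear to show an absolute difference in percentage points. The following is a minimal Python sketch under that assumption; the helper names are hypothetical and are not part of lkp-tests. Because the report prints rounded means, some rows will not reproduce to the last digit.

```python
def relative_change_pct(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent (assumed meaning of %change)."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, used here for metrics that are already percentages."""
    return new - base


# stress-ng.netdev.ops_per_sec: 159180 (570c8efd5e) vs 61151 (cccb45d7c4)
ops_change = relative_change_pct(159180, 61151)
print(f"{ops_change:+.1f}%")      # -61.6%

# mpstat.cpu.all.sys%: 96.53 vs 79.47, reported as a point difference
sys_change = point_change(96.53, 79.47)
print(f"{sys_change:+.1f}")       # -17.1
```

Both values match the corresponding rows of the table, which supports the interpretation but does not confirm how the report computes them internally.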
***************************************************************************************************
lkp-cml-d02: 20 threads 1 sockets (Comet Lake) with 16G memory
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-9.4/200%/debian-12-x86_64-20240206.cgz/900s/lkp-cml-d02/TCP_STREAM/netperf
commit:
570c8efd5e ("sched/psi: Optimize psi_group_change() cpu_clock() usage")
cccb45d7c4 ("sched/deadline: Less agressive dl_server handling")
570c8efd5eb79c37 cccb45d7c4295bbfeba616582d0
---------------- ---------------------------
%stddev %change %stddev
\ | \
8399 +29.1% 10840 ± 13% vmstat.system.cs
974775 ± 5% +12.9% 1100703 ± 12% sched_debug.cpu.avg_idle.max
8.99 ± 3% +8.1% 9.72 ± 6% sched_debug.cpu.clock.stddev
208144 +24.9% 260012 ± 12% sched_debug.cpu.nr_switches.avg
166255 +16.0% 192901 ± 11% sched_debug.cpu.nr_switches.stddev
31290 +3.7% 32458 ± 2% proc-vmstat.nr_shmem
2.324e+08 -2.1% 2.274e+08 proc-vmstat.numa_hit
2.324e+08 -2.1% 2.274e+08 proc-vmstat.numa_local
1.851e+09 -2.1% 1.812e+09 proc-vmstat.pgalloc_normal
1.851e+09 -2.1% 1.812e+09 proc-vmstat.pgfree
1683 -2.2% 1647 netperf.ThroughputBoth_Mbps
67336 -2.2% 65887 netperf.ThroughputBoth_total_Mbps
1683 -2.2% 1647 netperf.Throughput_Mbps
67336 -2.2% 65887 netperf.Throughput_total_Mbps
2006974 +41.4% 2838711 ± 16% netperf.time.involuntary_context_switches
4.624e+08 -2.2% 4.524e+08 netperf.workload
117.25 +1.7% 119.19 perf-stat.i.MPKI
6.963e+08 -1.5% 6.858e+08 perf-stat.i.branch-instructions
8356 +29.1% 10785 ± 13% perf-stat.i.context-switches
25.44 +1.5% 25.82 perf-stat.i.cpi
117.06 +2.5% 119.93 perf-stat.i.cpu-migrations
3.35e+09 -1.5% 3.3e+09 perf-stat.i.instructions
0.04 -1.5% 0.04 perf-stat.i.ipc
115.09 +1.7% 117.03 perf-stat.overall.MPKI
24.97 +1.5% 25.35 perf-stat.overall.cpi
0.04 -1.5% 0.04 perf-stat.overall.ipc
6.954e+08 -1.5% 6.849e+08 perf-stat.ps.branch-instructions
8346 +29.1% 10773 ± 13% perf-stat.ps.context-switches
116.93 +2.5% 119.85 perf-stat.ps.cpu-migrations
3.346e+09 -1.5% 3.296e+09 perf-stat.ps.instructions
3.021e+12 -1.5% 2.976e+12 perf-stat.total.instructions
7.77 ± 4% -24.0% 5.91 ± 16% perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
5.10 ± 14% -19.5% 4.11 ± 13% perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
6.21 ± 6% -21.0% 4.91 ± 13% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
6.78 ± 12% -23.9% 5.16 ± 12% perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
6.41 ± 3% -21.6% 5.03 ± 15% perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
1.14 ±119% +155.9% 2.90 ± 51% perf-sched.sch_delay.avg.ms.__cond_resched.task_work_run.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.91 -21.7% 5.41 ± 13% perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
4.82 ± 18% -35.7% 3.10 ± 17% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
7.39 ± 4% -25.4% 5.51 ± 13% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
1.65 ± 24% +33.5% 2.21 ± 16% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
5.41 -17.6% 4.46 ± 11% perf-sched.total_sch_delay.average.ms
18.99 -16.5% 15.86 ± 11% perf-sched.total_wait_and_delay.average.ms
48294 +23.0% 59391 ± 12% perf-sched.total_wait_and_delay.count.ms
13.58 -16.0% 11.40 ± 10% perf-sched.total_wait_time.average.ms
15.56 ± 4% -24.0% 11.83 ± 16% perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
102.17 ± 8% -28.6% 72.96 ± 30% perf-sched.wait_and_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
13.81 ± 5% -20.3% 11.01 ± 12% perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
14.64 -21.1% 11.55 ± 13% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
641.16 ± 6% +7.6% 689.79 ± 5% perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
4579 ± 6% +44.6% 6622 ± 25% perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
101.80 ± 3% +12.0% 114.00 ± 8% perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
35.20 ± 15% +35.4% 47.67 ± 29% perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
23823 +31.1% 31228 ± 15% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
1456 ± 3% +28.9% 1878 ± 18% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
764.24 ± 18% -40.2% 457.29 ± 29% perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
7.79 ± 4% -24.0% 5.92 ± 16% perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
7.60 ± 5% -19.7% 6.10 ± 12% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
6.81 ± 12% -23.3% 5.22 ± 13% perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
8.41 ± 3% -22.1% 6.55 ± 14% perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
7.73 -20.6% 6.14 ± 12% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
10.04 ± 17% -28.6% 7.17 ± 9% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
641.09 ± 6% +7.5% 689.00 ± 5% perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
764.17 ± 18% -40.2% 457.17 ± 29% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
41.85 -0.3 41.58 perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
0.64 ± 8% -0.3 0.38 ± 71% perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
0.62 ± 3% -0.2 0.38 ± 70% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
0.60 ± 3% -0.2 0.37 ± 70% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue
34.37 -0.2 34.15 perf-profile.calltrace.cycles-pp._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
33.61 -0.2 33.39 perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_from_iter.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg
34.89 -0.2 34.74 perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
0.76 ± 7% -0.1 0.64 ± 12% perf-profile.calltrace.cycles-pp.tcp_write_xmit.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
99.42 -0.1 99.36 perf-profile.calltrace.cycles-pp.main
2.42 ± 2% +0.1 2.49 perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
2.43 ± 2% +0.1 2.50 perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_protocol_deliver_rcu
3.86 ± 2% +0.1 3.96 perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
3.88 +0.1 3.98 perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
3.88 +0.1 3.98 perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
3.99 ± 2% +0.1 4.10 perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.handle_softirqs
4.02 ± 2% +0.1 4.13 perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.handle_softirqs.do_softirq
4.02 ± 2% +0.1 4.13 perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip
54.53 +0.2 54.68 perf-profile.calltrace.cycles-pp.recv.recv_omni.process_requests.spawn_child.accept_connection
54.44 +0.2 54.60 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recv.recv_omni.process_requests
54.44 +0.2 54.60 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recv.recv_omni.process_requests.spawn_child
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.accept_connection.accept_connections.main
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.accept_connections.main
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.process_requests.spawn_child.accept_connection.accept_connections.main
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.spawn_child.accept_connection.accept_connections.main
54.77 +0.2 54.94 perf-profile.calltrace.cycles-pp.recv_omni.process_requests.spawn_child.accept_connection.accept_connections
41.87 -0.3 41.60 perf-profile.children.cycles-pp.tcp_sendmsg_locked
34.39 -0.2 34.16 perf-profile.children.cycles-pp._copy_from_iter
34.91 -0.2 34.76 perf-profile.children.cycles-pp.skb_do_copy_data_nocache
1.38 ± 4% -0.1 1.25 ± 4% perf-profile.children.cycles-pp.napi_consume_skb
0.86 ± 3% -0.1 0.74 ± 12% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
3.55 -0.1 3.43 ± 2% perf-profile.children.cycles-pp.skb_release_data
99.55 -0.1 99.50 perf-profile.children.cycles-pp.main
98.61 -0.0 98.56 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.06 ± 6% +0.0 0.08 ± 14% perf-profile.children.cycles-pp.switch_fpu_return
0.33 ± 3% +0.1 0.38 ± 8% perf-profile.children.cycles-pp.schedule
0.35 ± 2% +0.1 0.41 ± 9% perf-profile.children.cycles-pp.exit_to_user_mode_loop
0.39 ± 4% +0.1 0.46 ± 11% perf-profile.children.cycles-pp.__schedule
54.77 +0.2 54.94 perf-profile.children.cycles-pp.accept_connection
54.77 +0.2 54.94 perf-profile.children.cycles-pp.accept_connections
54.77 +0.2 54.94 perf-profile.children.cycles-pp.process_requests
54.77 +0.2 54.94 perf-profile.children.cycles-pp.spawn_child
54.77 +0.2 54.94 perf-profile.children.cycles-pp.recv_omni
54.63 +0.2 54.80 perf-profile.children.cycles-pp.recv
0.86 ± 3% -0.1 0.74 ± 12% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.12 -0.0 1.09 perf-profile.self.cycles-pp.__free_frozen_pages
0.15 ± 2% +0.0 0.18 ± 12% perf-profile.self.cycles-pp.__rmqueue_pcplist
0.22 ± 11% +0.0 0.27 ± 6% perf-profile.self.cycles-pp.__check_object_size
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki